We use Dirichlet form methods to construct and analyze a reversible
Markov process, the stationary distribution of which is the
Brownian continuum random tree. This process is inspired by the
subtree prune and regraft (SPR) Markov chains that appear in
phylogenetic analysis.

A key technical ingredient in this work is the use of a novel
Gromov-Hausdorff type distance to metrize the space whose elements
are compact real trees equipped with a probability measure. Also, the
investigation of the Dirichlet form hinges on a new path decomposition
of the Brownian excursion.



1. Introduction. Markov chains that move through a space of finite trees
are an important ingredient for several algorithms in phylogenetic analysis,
particularly in Markov chain Monte Carlo algorithms for simulating distributions
on spaces of trees in Bayesian tree reconstruction and in simulated
annealing algorithms in maximum likelihood and maximum parsimony tree
reconstruction (see, e.g., [21] for a comprehensive overview of the field).
(Maximum parsimony tree reconstruction is based on finding the phylogenetic
tree and inferred ancestral states that minimize the total number of
obligatory inferred substitution events on the edges of the tree.) Usually,
such chains are based on a set of simple rearrangements that transform a
tree into a "neighboring" tree. One widely used set of moves is the
nearest-neighbor interchanges (NNI) (see, e.g., [6, 7, 9, 21]). Two other standard sets
of moves that are implemented in several phylogenetic software packages but
seem to have received less theoretical attention are the subtree prune and
regraft (SPR) moves and the tree bisection and reconnection (TBR) moves
that were first described in [32] and are further discussed in [6, 21, 30]. We
note that an NNI move is a particular type of SPR move and that an SPR 
move is a particular type of TBR move and, moreover, that every TBR op- 
eration is either a single SPR move or the composition of two such moves 
(see, e.g., Section 2.6 of [30]). Chains based on other moves are investigated 
in [5, 14, 29]. 

In an SPR move, a binary tree T (i.e., a tree in which all nonleaf vertices 
have degree 3) is cut "in the middle of an edge" to give two subtrees, say 
T' and T''. Another edge is chosen in T', a new vertex is created "in the
middle" of that edge, and the cut edge in T'' is attached to this new vertex.
Lastly, the "pendant" cut edge in T' is removed along with the vertex it was
attached to in order to produce a new binary tree that has the same number 
of vertices as T. See Figure 1. 
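
The SPR move just described can be sketched in code. The following is a minimal illustration, not an implementation taken from any phylogenetic package: the adjacency-map representation, the function names `component` and `spr_move`, and the requirement that the target edge is not incident to the vertex x are all assumptions made for this sketch.

```python
from collections import deque


def component(tree, start):
    """Return the set of vertices reachable from `start` in `tree`."""
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in tree[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def spr_move(adj, cut, target, new_vertex):
    """Return a new adjacency map obtained from `adj` by one SPR move.

    adj        : dict mapping each vertex to the set of its neighbours
    cut        : (x, s); the edge {x, s} is cut, s carries the pruned subtree
    target     : (b, c); an edge of the remaining tree, not incident to x
    new_vertex : fresh label for the vertex inserted into the target edge
    """
    x, s = cut
    b, c = target
    tree = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    if s not in tree[x] or len(tree[x]) != 3:
        raise ValueError("cut edge must join a degree-3 vertex x to s")
    if c not in tree[b] or x in (b, c) or new_vertex in tree:
        raise ValueError("invalid target edge or vertex label")

    # Cut the edge {x, s}; the component containing s is the pruned subtree.
    tree[x].remove(s)
    tree[s].remove(x)
    if {b, c} & component(tree, s):
        raise ValueError("target edge must lie in the remaining tree")

    # Insert the new vertex into the target edge and attach the pruned subtree.
    tree[b].remove(c)
    tree[c].remove(b)
    tree[new_vertex] = {b, c, s}
    for v in (b, c, s):
        tree[v].add(new_vertex)

    # Remove x and merge its two remaining edges into a single edge.
    p, q = tree.pop(x)
    tree[p].remove(x)
    tree[q].remove(x)
    tree[p].add(q)
    tree[q].add(p)
    return tree


# A binary tree with five leaves L1..L5 and internal vertices u, v, w.
caterpillar = {
    "L1": {"u"}, "L2": {"u"}, "u": {"L1", "L2", "v"},
    "L3": {"v"}, "v": {"u", "L3", "w"},
    "L4": {"w"}, "L5": {"w"}, "w": {"v", "L4", "L5"},
}
# Prune the leaf L1 from u and regraft it onto the edge {w, L5}.
moved = spr_move(caterpillar, ("u", "L1"), ("w", "L5"), "y")
```

In this example the result is again a binary tree with the same number of vertices: the vertex u disappears and the new vertex y takes its place as an internal vertex.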

As remarked in [6], 

The SPR operation is of particular interest as it can be used to model biological 
processes such as horizontal gene transfer and recombination. 




Fig. 1. An SPR move. The dashed subtree attached to vertex x in the top tree is
reattached at a new vertex y that is inserted into the edge (b, c) in the bottom tree to make 
two edges (b,y) and (y,c). The two edges (a,x) and (b,x) in the top tree are merged into 
a single edge (a, b) in the bottom tree. 
(Horizontal gene transfer is the transfer of genetic material from one species
to another. It is a particularly common phenomenon among bacteria.) Sec- 
tion 2.7 of [30] provides more background on this point as well as a comment 
on the role of SPR moves in the two phenomena of lineage sorting and gene 
duplication and loss. 

In this paper we investigate the asymptotics of the simplest possible tree- 
valued Markov chain based on the SPR moves, namely the chain in which 
the two edges that are chosen for cutting and for reattaching are chosen 
uniformly (without replacement) from the edges in the current tree. Intuitively,
the continuous-time Markov process we discuss arises as a limit when
the number of vertices in the tree goes to infinity, the edge lengths are 
rescaled by a constant factor so that the initial tree converges in a suitable 
sense to a continuous analogue of a combinatorial tree (more specifically, a 
compact real tree), and the time scale of the Markov chain is sped up by an 
appropriate factor. 

We do not, in fact, prove such a limit theorem. Rather, we use Dirichlet 
form techniques to establish the existence of a process that has the dynamics 
one would expect from such a limit. Unfortunately, although Dirichlet form 
techniques provide powerful tools for constructing and analyzing symmetric 
Markov processes, they are notoriously inadequate for proving convergence 
theorems (as opposed to generator or martingale problem characterizations 
of Markov processes, e.g.). We therefore leave the problem of establishing a 
limit theorem to future research. 

The Markov process we construct is a pure jump process that is reversible 
with respect to the distribution of Aldous' continuum random tree (i.e., the 
random tree which arises as the rescaling limit of uniform random trees 
with n vertices when $n \to \infty$ and which is also, up to a constant scaling
factor, the random tree associated naturally with the standard Brownian 
excursion — see Section 4 for more details about the continuum random tree, 
its connection with Brownian excursion and references to the literature). 

Somewhat more precisely, but still rather informally, the process we con- 
struct has the following description. 

To begin with, Aldous' continuum random tree has two natural measures 
on it that can both be thought of as arising from the measure on an approx- 
imating finite tree with n vertices that places a unit mass at each vertex. If 
we rescale the mass of this measure to get a probability measure, then in the 
limit we obtain a probability measure on the continuum random tree that 
happens to assign all of its mass to the leaves with probability 1. We call this 
probability measure the weight on the continuum tree. On the other hand, 
we can also rescale the measure that places a unit mass at each vertex to 
obtain in the limit a $\sigma$-finite measure on the continuum tree that restricts
to one-dimensional Lebesgue measure if we restrict to any path through the
continuum tree. We call this $\sigma$-finite measure the length.
The continuum random tree is a random compact real tree of the sort
investigated in [20] (we define real trees and discuss some of their properties 
in Section 2). Any compact real tree has an analogue of the length measure 
on it, but in general there is no canonical analogue of the weight measure. 
Consequently, the process we construct has as its state space the set of pairs 
$(T,\nu)$, where T is a compact real tree and $\nu$ is a probability measure on T.
Let $\mu$ be the length measure associated with T. Our process jumps away
from T by first choosing a pair of points $(u,v) \in T \times T$ according to the rate
measure $\mu \otimes \nu$ and then transforming T into a new tree by cutting off the
subtree rooted at u that does not contain v and reattaching this subtree at 
v. This jump kernel (which typically has infinite total mass — so that jumps 
are occurring on a dense countable set) is precisely what one would expect 
for a limit (as the number of vertices goes to infinity) of the particular SPR 
Markov chain on finite trees described above in which the edges for cutting 
and reattachment are chosen uniformly at each stage. 

The framework of Dirichlet forms allows us to translate this description 
into rigorous mathematics. An important preliminary step that we accom- 
plish in Section 2 is to show that it is possible to equip the space of pairs 
of compact real trees and their accompanying weights with a nice Gromov- 
Hausdorff-like metric that makes this space complete and separable. We 
note that a Gromov-Hausdorff-like metric on more general metric spaces 
equipped with measures was introduced in [31]. The latter metric is based 
on the Wasserstein $L^2$ distance between measures, whereas ours is based
on the Prohorov distance. Moreover, we need to understand in detail the 
Dirichlet form arising from the combination of the jump kernel with the 
continuum random tree distribution as a reference measure, and we accom- 
plish this in Sections 5 and 6, where we establish the relevant facts from 
what appears to be a novel path decomposition of the standard Brownian 
excursion. We construct the Dirichlet form and the resulting process in Sec- 
tion 7. We use potential theory for Dirichlet forms to show in Section 8 that 
from almost all starting points (with respect to the continuum random tree 
reference measure) our process does not hit the trivial tree consisting of a 
single point. 

We remark that excursion path-valued Markov processes that are re- 
versible with respect to the distribution of standard Brownian excursion 
and have continuous sample paths have been investigated in [34, 35, 36], 
and that these processes can also be thought of as real tree-valued diffusion
processes that are reversible with respect to the distribution of the contin- 
uum random tree. However, we are unaware of a description in which these 
latter processes arise as limits of natural processes on spaces of finite trees. 

2. Weighted R-trees. A metric space (X, d) is a real tree (R-tree) if it
satisfies the following axioms.

Axiom 0 (Completeness). The space (X, d) is complete.

Axiom 1 (Unique geodesics). For all $x, y \in X$ there exists a unique isometric
embedding $\phi_{x,y} : [0, d(x,y)] \to X$ such that $\phi_{x,y}(0) = x$ and
$\phi_{x,y}(d(x,y)) = y$.

Axiom 2 (Loop-free). For every injective continuous map $\psi : [0,1] \to X$
one has $\psi([0,1]) = \phi_{\psi(0),\psi(1)}([0, d(\psi(0),\psi(1))])$.

Axiom 1 says simply that there is a unique "unit speed" path between
any two points, whereas Axiom 2 implies that the image of any injective
path connecting two points coincides with the image of the unique unit
speed path, so that it can be reparameterized to become the unit speed
path. Thus, Axiom 1 is satisfied by many other spaces such as $\mathbb{R}^d$ with the
usual metric, whereas Axiom 2 expresses the property of "treeness" and is
only satisfied by $\mathbb{R}^d$ when $d = 1$. We refer the reader to [12, 15, 16, 17, 33]
for background on R-trees. In particular, [12] shows that a number of other
definitions are equivalent to the one above. A particularly useful fact is that
a metric space (X, d) is an R-tree if and only if it is complete, path-connected
and satisfies the so-called four point condition, that is,

(2.1) $d(x_1,x_2) + d(x_3,x_4) \le \max\{d(x_1,x_3) + d(x_2,x_4),\ d(x_1,x_4) + d(x_2,x_3)\}$

for all $x_1, \ldots, x_4 \in X$.
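
The four point condition (2.1) is easy to test on a finite set of points. The sketch below is only an illustration: the function name `satisfies_four_point`, the tolerance, and the two sample point sets are assumptions chosen for this example. It checks a few points on the real line, which pass, and the corners of a unit square in the plane, which fail, in line with the remark that "treeness" singles out the case d = 1.

```python
import math
from itertools import product


def satisfies_four_point(points, d, tol=1e-12):
    """Check inequality (2.1) for every ordered quadruple drawn from `points`."""
    for x1, x2, x3, x4 in product(points, repeat=4):
        lhs = d(x1, x2) + d(x3, x4)
        rhs = max(d(x1, x3) + d(x2, x4), d(x1, x4) + d(x2, x3))
        if lhs > rhs + tol:
            return False
    return True


# Four points on the real line with the usual metric.
line_points = [0.0, 1.0, 3.0, 7.0]
on_line = satisfies_four_point(line_points, lambda x, y: abs(x - y))

# The corners of the unit square with the Euclidean metric.
square = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
in_plane = satisfies_four_point(square, math.dist)
```

For the square, taking the two diagonals on the left-hand side of (2.1) gives $2\sqrt{2}$, whereas both sums on the right-hand side equal 2.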

Let $\mathbf{T}$ denote the set of isometry classes of compact R-trees. In order to
equip $\mathbf{T}$ with a metric, recall that the Hausdorff distance between two closed
subsets A, B of a metric space (X, d) is defined as

(2.2) $d_{\mathrm{H}}(A,B) := \inf\{\varepsilon > 0 : A \subseteq U_\varepsilon(B) \text{ and } B \subseteq U_\varepsilon(A)\}$,

where $U_\varepsilon(C) := \{x \in X : d(x,C) \le \varepsilon\}$. Based on this notion of distance
between closed sets, we define the Gromov-Hausdorff distance, $d_{\mathrm{GH}}(X,Y)$,
between two metric spaces $(X,d_X)$ and $(Y,d_Y)$ as the infimum of the Hausdorff
distance $d_{\mathrm{H}}(X',Y')$ over all metric spaces X' and Y' that are isometric
to X and Y, respectively, and that are subspaces of some common metric
space Z (cf. [10, 11, 23]).
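
For finite sets the infimum in (2.2) reduces to a maximum of point-to-set distances, which gives a direct way to compute the Hausdorff distance. The following sketch assumes finite subsets and a user-supplied metric; the name `hausdorff_distance` and the sample sets are illustrative choices only.

```python
def hausdorff_distance(A, B, d):
    """Hausdorff distance (2.2) between finite nonempty sets A and B under the metric d."""

    def dist_to_set(x, C):
        # Distance from the point x to the finite set C.
        return min(d(x, c) for c in C)

    # Smallest eps with A inside U_eps(B), and the same with A and B swapped.
    a_into_b = max(dist_to_set(a, B) for a in A)
    b_into_a = max(dist_to_set(b, A) for b in B)
    return max(a_into_b, b_into_a)


# Two subsets of the real line.
A = [0, 1]
B = [0, 3]
example_distance = hausdorff_distance(A, B, lambda x, y: abs(x - y))
```

Here every point of A is within distance 1 of B, but the point 3 of B is at distance 2 from A, so the Hausdorff distance is 2.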

A direct application of the previous definition requires an optimal em- 
bedding into a space Z which it is not possible to obtain explicitly in most 
examples. We therefore give an equivalent reformulation which allows us to 
get estimates on the distance by looking for "matchings" between the two 
spaces that preserve the two metrics up to an additive error. In order to be 
more explicit, we require some more notation. A subset $\mathfrak{R} \subseteq X \times Y$ is said
to be a correspondence between sets X and Y if for each $x \in X$ there exists






S. N. EVANS AND A. WINTER 



at least one $y \in Y$ such that $(x,y) \in \mathfrak{R}$, and for each $y \in Y$ there exists at
least one $x \in X$ such that $(x,y) \in \mathfrak{R}$. The distortion of $\mathfrak{R}$ is defined by

(2.3) $\operatorname{dis}(\mathfrak{R}) := \sup\{|d_X(x_1,x_2) - d_Y(y_1,y_2)| : (x_1,y_1),(x_2,y_2) \in \mathfrak{R}\}$.

Then

(2.4) $d_{\mathrm{GH}}((X,d_X),(Y,d_Y)) = \frac{1}{2} \inf_{\mathfrak{R}} \operatorname{dis}(\mathfrak{R})$,

where the infimum is taken over all correspondences $\mathfrak{R}$ between X and Y
(see, e.g., Theorem 7.3.25 in [11]). 
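
For very small finite metric spaces, formula (2.4) can be evaluated by brute force over all correspondences. The sketch below is purely illustrative: it assumes the metrics are given as distance matrices, its running time is exponential in the product of the two cardinalities, and the names `distortion` and `gromov_hausdorff` are hypothetical.

```python
from itertools import product


def distortion(R, dX, dY):
    """Distortion (2.3) of a correspondence R given as a list of index pairs."""
    return max(
        abs(dX[x1][x2] - dY[y1][y2])
        for (x1, y1), (x2, y2) in product(R, repeat=2)
    )


def gromov_hausdorff(dX, dY):
    """Evaluate (2.4) by enumerating every correspondence between two finite spaces."""
    n, m = len(dX), len(dY)
    pairs = [(i, j) for i in range(n) for j in range(m)]
    best = float("inf")
    for mask in range(1, 1 << len(pairs)):
        R = [p for k, p in enumerate(pairs) if (mask >> k) & 1]
        covers_X = {i for i, _ in R} == set(range(n))
        covers_Y = {j for _, j in R} == set(range(m))
        if covers_X and covers_Y:  # R is a correspondence
            best = min(best, distortion(R, dX, dY))
    return best / 2


one_point = [[0]]
two_points_close = [[0, 1], [1, 0]]
two_points_far = [[0, 3], [3, 0]]

to_point = gromov_hausdorff(two_points_close, one_point)
close_to_far = gromov_hausdorff(two_points_close, two_points_far)
```

In the first example the only correspondence matches both points to the single point and has distortion 1, so the distance is 1/2. In the second, the best correspondences are the two bijections, with distortion 2.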

It is shown in Theorem 1 in [20] that the metric space $(\mathbf{T}, d_{\mathrm{GH}})$ is complete
and separable.

In the following we will be interested in compact R-trees $(T,d) \in \mathbf{T}$ equipped
with a probability measure $\nu$ on the Borel $\sigma$-field $\mathcal{B}(T)$. We call such
objects weighted compact R-trees and write $\mathbf{T}^{\mathrm{wt}}$ for the space of
weight-preserving isometry classes of weighted compact R-trees, where we say that
two weighted, compact R-trees $(X,d,\nu)$ and $(X',d',\nu')$ are weight-preserving
isometric if there exists an isometry $\phi$ between X and X' such that the
push-forward of $\nu$ by $\phi$ is $\nu'$:

(2.5) $\nu' = \phi_*\nu := \nu \circ \phi^{-1}$.

It is clear that the property of being weight-preserving isometric is an equiv- 
alence relation. 

We want to equip $\mathbf{T}^{\mathrm{wt}}$ with a Gromov-Hausdorff type of distance which
incorporates the weights on the trees, but first we need to introduce some
notions that will be used in the definition.

An $\varepsilon$-(distorted) isometry between two metric spaces $(X,d_X)$ and $(Y,d_Y)$
is a (possibly nonmeasurable) map $f : X \to Y$ such that

(2.6) $\operatorname{dis}(f) := \sup\{|d_X(x_1,x_2) - d_Y(f(x_1),f(x_2))| : x_1,x_2 \in X\} \le \varepsilon$

and f(X) is an $\varepsilon$-net in Y.

It is easy to see that if for two metric spaces $(X,d_X)$ and $(Y,d_Y)$ and $\varepsilon > 0$
we have $d_{\mathrm{GH}}((X,d_X),(Y,d_Y)) < \varepsilon$, then there exists a $2\varepsilon$-isometry from X
to Y (cf. Lemma 7.3.28 in [11]). The following lemma states that we may
choose the distorted isometry between X and Y to be measurable if we allow
a slightly bigger distortion.

Lemma 2.1. Let $(X,d_X)$ and $(Y,d_Y)$ be two compact real trees such that
$d_{\mathrm{GH}}((X,d_X),(Y,d_Y)) < \varepsilon$ for some $\varepsilon > 0$. Then there exists a measurable
$3\varepsilon$-isometry from X to Y.

Proof. If $d_{\mathrm{GH}}((X,d_X),(Y,d_Y)) < \varepsilon$, then by (2.4) there exists a
correspondence $\mathfrak{R}$ between X and Y such that $\operatorname{dis}(\mathfrak{R}) < 2\varepsilon$. Since $(X,d_X)$ is
compact there exists a finite $\varepsilon$-net in X. We claim that for each such finite
$\varepsilon$-net $S^{X,\varepsilon} = \{x_1,\ldots,x_{N^\varepsilon}\} \subseteq X$, any set $S^{Y,\varepsilon} = \{y_1,\ldots,y_{N^\varepsilon}\} \subseteq Y$ such that
$(x_i,y_i) \in \mathfrak{R}$ for all $i \in \{1,2,\ldots,N^\varepsilon\}$ is a $3\varepsilon$-net in Y. To see this, fix $y \in Y$. We
have to show the existence of $i \in \{1,2,\ldots,N^\varepsilon\}$ with $d_Y(y_i,y) < 3\varepsilon$. For that
choose $x \in X$ such that $(x,y) \in \mathfrak{R}$. Since $S^{X,\varepsilon}$ is an $\varepsilon$-net in X there exists
an $i \in \{1,2,\ldots,N^\varepsilon\}$ such that $d_X(x_i,x) < \varepsilon$. $(x_i,y_i) \in \mathfrak{R}$ implies therefore
that $|d_X(x_i,x) - d_Y(y_i,y)| \le \operatorname{dis}(\mathfrak{R}) < 2\varepsilon$, and hence $d_Y(y_i,y) < 3\varepsilon$.

Furthermore we may decompose X into $N^\varepsilon$ possibly empty measurable
disjoint subsets of X by letting $X^{1,\varepsilon} := B(x_1,\varepsilon)$, $X^{2,\varepsilon} := B(x_2,\varepsilon) \setminus X^{1,\varepsilon}$, and
so on, where B(x, r) is the open ball $\{x' \in X : d_X(x,x') < r\}$. Then f defined
by $f(x) = y_i$ for $x \in X^{i,\varepsilon}$ is obviously a measurable $3\varepsilon$-isometry from X to
Y. □
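
The partition step at the end of the proof can be mimicked on a finite sample. The sketch below is a rough illustration under several assumptions: X is replaced by eleven points on the line, the net is built greedily, and the images `images` of the net points in Y are hypothetical values standing in for points matched through a correspondence. The helper names are not from the source.

```python
def greedy_net(points, d, eps):
    """Select points greedily so that every point lies within eps of a selected one."""
    net = []
    for p in points:
        if all(d(p, q) >= eps for q in net):
            net.append(p)
    return net


def cell_index(x, net, d, eps):
    """Index i of the cell B(x_i, eps) minus the earlier cells that contains x."""
    for i, centre in enumerate(net):
        if d(x, centre) < eps:
            return i
    raise ValueError("net is not an eps-net for this point")


def distortion_of_map(points, f, dX, dY):
    """Largest discrepancy between distances before and after applying f."""
    return max(abs(dX(p, q) - dY(f(p), f(q))) for p in points for q in points)


def line_metric(x, y):
    return abs(x - y)


eps = 2.5
X = list(range(11))                 # sample points 0, 1, ..., 10 on the line
net = greedy_net(X, line_metric, eps)
cells = [cell_index(x, net, line_metric, eps) for x in X]

# Hypothetical images y_i in Y = R of the net points x_i.
images = [0.0, 3.2, 6.1, 9.3]


def f(x):
    # The piecewise constant map of the proof: send the whole i-th cell to y_i.
    return images[cell_index(x, net, line_metric, eps)]


dis_f = distortion_of_map(X, f, line_metric, line_metric)
```

The map f is constant on each cell, which is what makes it measurable in the proof; in this example its distortion stays below $3\varepsilon$.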

We also need to recall the definition of the Prohorov distance between 
two probability measures (see, e.g., [19]). Given two probability measures $\mu$
and $\nu$ on a metric space (X, d) with the corresponding collection of closed
sets denoted by $\mathcal{C}$, the Prohorov distance between them is

$d_{\mathrm{P}}(\mu,\nu) := \inf\{\varepsilon > 0 : \mu(C) \le \nu(C^\varepsilon) + \varepsilon \text{ for all } C \in \mathcal{C}\}$,

where $C^\varepsilon := \{x \in X : \inf_{y \in C} d(x,y) < \varepsilon\}$. The Prohorov distance is a metric
on the collection of probability measures on X. The following result shows 
that if we push measures forward with a map having a small distortion, then 
Prohorov distances cannot increase too much. 

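Lemma 2.2. Suppose that $(X,d_X)$ and $(Y,d_Y)$ are two metric spaces,
$f : X \to Y$ is a measurable map with $\operatorname{dis}(f) \le \varepsilon$, and $\mu$ and $\nu$ are two
probability measures on X. Then

$d_{\mathrm{P}}(f_*\mu, f_*\nu) \le d_{\mathrm{P}}(\mu,\nu) + \varepsilon$.

Proof. Suppose that $d_{\mathrm{P}}(\mu,\nu) < \delta$. By definition, $\mu(C) \le \nu(C^\delta) + \delta$ for
all closed sets $C \in \mathcal{C}$. If D is a closed subset of Y, then

(2.7) $f_*\mu(D) = \mu(f^{-1}(D)) \le \mu(\overline{f^{-1}(D)}) \le \nu(\overline{f^{-1}(D)}^{\,\delta}) + \delta = \nu(f^{-1}(D)^\delta) + \delta$.

Now $x' \in f^{-1}(D)^\delta$ means there is $x'' \in X$ such that $d_X(x',x'') < \delta$ and
$f(x'') \in D$. By the assumption that $\operatorname{dis}(f) \le \varepsilon$, we have $d_Y(f(x'),f(x'')) < \delta + \varepsilon$,
and hence $f(x') \in D^{\delta+\varepsilon}$. Thus

(2.8) $f^{-1}(D)^\delta \subseteq f^{-1}(D^{\delta+\varepsilon})$

and we have

(2.9) $f_*\mu(D) \le \nu(f^{-1}(D^{\delta+\varepsilon})) + \delta = f_*\nu(D^{\delta+\varepsilon}) + \delta$,

so that $d_{\mathrm{P}}(f_*\mu, f_*\nu) \le \delta + \varepsilon$, as required. □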
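
On a finite metric space every subset is closed, so the Prohorov distance can be approximated by checking the defining inequality over all subsets and bisecting in $\varepsilon$. The sketch below is an illustration under those assumptions (finitely supported measures stored as dictionaries, exponential cost in the number of points); the names `prohorov_distance` and `push_forward`, and the sample map, are hypothetical. It also checks the bound of Lemma 2.2 on one example.

```python
from itertools import combinations


def prohorov_distance(points, d, mu, nu, iterations=60):
    """Approximate the Prohorov distance between measures on a finite metric space."""
    subsets = [
        c for r in range(1, len(points) + 1) for c in combinations(points, r)
    ]

    def mass(measure, subset):
        return sum(measure.get(x, 0.0) for x in subset)

    def admissible(eps):
        # Does mu(C) <= nu(C^eps) + eps hold for every nonempty subset C?
        for C in subsets:
            C_eps = [x for x in points if min(d(x, y) for y in C) < eps]
            if mass(mu, C) > mass(nu, C_eps) + eps:
                return False
        return True

    lo, hi = 0.0, 1.0  # eps = 1 is always admissible for probability measures
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if admissible(mid):
            hi = mid
        else:
            lo = mid
    return hi


def push_forward(measure, f):
    """Image of a finitely supported measure under the map f."""
    image = {}
    for x, weight in measure.items():
        image[f(x)] = image.get(f(x), 0.0) + weight
    return image


def line_metric(x, y):
    return abs(x - y)


X = [0, 1, 2]
mu = {0: 0.5, 1: 0.5}
nu = {1: 0.5, 2: 0.5}


def stretch(x):
    # A map into the real line with distortion 0.2 on X.
    return 1.1 * x


Y = [stretch(x) for x in X]
before = prohorov_distance(X, line_metric, mu, nu)
after = prohorov_distance(
    Y, line_metric, push_forward(mu, stretch), push_forward(nu, stretch)
)
```

In this example both distances equal 1/2, which is consistent with the bound $d_{\mathrm{P}}(f_*\mu, f_*\nu) \le d_{\mathrm{P}}(\mu,\nu) + \varepsilon$ for $\varepsilon = 0.2$.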

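We are now in a position to define the weighted Gromov-Hausdorff distance
between the two compact, weighted R-trees $(X,d_X,\nu_X)$ and $(Y,d_Y,\nu_Y)$.
For $\varepsilon > 0$, set

(2.10) $F^{\varepsilon}_{X,Y} := \{\text{measurable } \varepsilon\text{-isometries from } X \text{ to } Y\}$.

Put

(2.11) $\Delta_{\mathrm{GH}^{\mathrm{wt}}}(X,Y) := \inf\{\varepsilon > 0 : \text{exist } f \in F^{\varepsilon}_{X,Y},\ g \in F^{\varepsilon}_{Y,X} \text{ such that } d_{\mathrm{P}}(f_*\nu_X,\nu_Y) \le \varepsilon,\ d_{\mathrm{P}}(\nu_X,g_*\nu_Y) \le \varepsilon\}$.

Note that the set on the right-hand side is nonempty because X and Y are
compact, and hence bounded. It will turn out that $\Delta_{\mathrm{GH}^{\mathrm{wt}}}$ satisfies all the
properties of a metric except the triangle inequality. To rectify this, let

(2.12) $d_{\mathrm{GH}^{\mathrm{wt}}}(X,Y) := \inf\left\{\sum_{i=1}^{n-1} \Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_i,Z_{i+1})^{1/4}\right\}$,

where the infimum is taken over all finite sequences of compact, weighted
R-trees $Z_1,\ldots,Z_n$ with $Z_1 = X$ and $Z_n = Y$.

Lemma 2.3. The map $d_{\mathrm{GH}^{\mathrm{wt}}} : \mathbf{T}^{\mathrm{wt}} \times \mathbf{T}^{\mathrm{wt}} \to \mathbb{R}_+$ is a metric on $\mathbf{T}^{\mathrm{wt}}$.
Moreover,

$\frac{1}{2} \Delta_{\mathrm{GH}^{\mathrm{wt}}}(X,Y)^{1/4} \le d_{\mathrm{GH}^{\mathrm{wt}}}(X,Y) \le \Delta_{\mathrm{GH}^{\mathrm{wt}}}(X,Y)^{1/4}$

for all $X, Y \in \mathbf{T}^{\mathrm{wt}}$.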

Proof. It is immediate from (2.11) that the map $\Delta_{\mathrm{GH}^{\mathrm{wt}}}$ is symmetric.
We next claim that

(2.13) $\Delta_{\mathrm{GH}^{\mathrm{wt}}}((X,d_X,\nu_X),(Y,d_Y,\nu_Y)) = 0$,

if and only if $(X,d_X,\nu_X)$ and $(Y,d_Y,\nu_Y)$ are weight-preserving isometric.
The "if" direction is immediate. Note first for the converse that (2.13) implies
that for all $\varepsilon > 0$ there exists an $\varepsilon$-isometry from X to Y, and therefore,
by Lemma 7.3.28 in [11], $d_{\mathrm{GH}}((X,d_X),(Y,d_Y)) < 2\varepsilon$. Thus
$d_{\mathrm{GH}}((X,d_X),(Y,d_Y)) = 0$, and it follows from Theorem 7.3.30 of [11] that $(X,d_X)$ and
$(Y,d_Y)$ are isometric. Checking the proof of that result, we see that we can
construct an isometry $f : X \to Y$ by taking any dense countable set $S \subset X$,
any sequence of functions $(f_n)$ such that $f_n$ is an $\varepsilon_n$-isometry with $\varepsilon_n \to 0$ as
$n \to \infty$, and letting f be $\lim_k f_{n_k}$ along any subsequence such that the limit
exists for all $x \in S$ (such a subsequence exists by the compactness of Y).
Therefore, fix some dense subset $S \subset X$ and suppose without loss of generality
that we have an isometry $f : X \to Y$ given by $f(x) = \lim_{n\to\infty} f_n(x)$,
$x \in S$, where $f_n \in F^{\varepsilon_n}_{X,Y}$, $d_{\mathrm{P}}(f_{n*}\nu_X,\nu_Y) \le \varepsilon_n$, and $\lim_{n\to\infty} \varepsilon_n = 0$. We will be
done if we can show that $f_*\nu_X = \nu_Y$. If $\mu_X$ is a discrete measure with atoms
belonging to S, then

(2.14) $d_{\mathrm{P}}(f_*\nu_X,\nu_Y) \le \limsup_n [d_{\mathrm{P}}(f_{n*}\nu_X,\nu_Y) + d_{\mathrm{P}}(f_{n*}\mu_X,f_{n*}\nu_X) + d_{\mathrm{P}}(f_*\mu_X,f_{n*}\mu_X) + d_{\mathrm{P}}(f_*\nu_X,f_*\mu_X)] \le 2 d_{\mathrm{P}}(\mu_X,\nu_X)$,

where we have used Lemma 2.2 and the fact that $\lim_{n\to\infty} d_{\mathrm{P}}(f_*\mu_X,f_{n*}\mu_X) = 0$
because of the pointwise convergence of $f_n$ to f on S. Because we can
choose $\mu_X$ so that $d_{\mathrm{P}}(\mu_X,\nu_X)$ is arbitrarily small, we see that $f_*\nu_X = \nu_Y$,
as required.

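Now consider three spaces $(X,d_X,\nu_X)$, $(Y,d_Y,\nu_Y)$ and $(Z,d_Z,\nu_Z)$ in $\mathbf{T}^{\mathrm{wt}}$,
and constants $\varepsilon, \delta > 0$, such that $\Delta_{\mathrm{GH}^{\mathrm{wt}}}((X,d_X,\nu_X),(Y,d_Y,\nu_Y)) < \varepsilon$ and
$\Delta_{\mathrm{GH}^{\mathrm{wt}}}((Y,d_Y,\nu_Y),(Z,d_Z,\nu_Z)) < \delta$. Then there exist $f \in F^{\varepsilon}_{X,Y}$ and $g \in F^{\delta}_{Y,Z}$
such that $d_{\mathrm{P}}(f_*\nu_X,\nu_Y) < \varepsilon$ and $d_{\mathrm{P}}(g_*\nu_Y,\nu_Z) < \delta$. Note that $g \circ f \in F^{\varepsilon+\delta}_{X,Z}$.
Moreover, by Lemma 2.2

(2.15) $d_{\mathrm{P}}((g \circ f)_*\nu_X,\nu_Z) \le d_{\mathrm{P}}(g_*\nu_Y,\nu_Z) + d_{\mathrm{P}}(g_*f_*\nu_X,g_*\nu_Y) < \delta + \varepsilon + \delta$.

This, and a similar argument with the roles of X and Z interchanged, shows
that

(2.16) $\Delta_{\mathrm{GH}^{\mathrm{wt}}}(X,Z) \le 2[\Delta_{\mathrm{GH}^{\mathrm{wt}}}(X,Y) + \Delta_{\mathrm{GH}^{\mathrm{wt}}}(Y,Z)]$.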

The second inequality in the statement of the lemma is clear. In order to
see the first inequality, it suffices to show that for any $Z_1,\ldots,Z_n$ we have

(2.17) $\Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_1,Z_n)^{1/4} \le 2 \sum_{i=1}^{n-1} \Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_i,Z_{i+1})^{1/4}$.

We will establish (2.17) by induction. The inequality certainly holds when
n = 2. Suppose it holds for $2, \ldots, n-1$. Write S for the value of the sum on
the right-hand side of (2.17). Put

(2.18) $k := \max\left\{1 \le m \le n-1 : \sum_{i=1}^{m-1} \Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_i,Z_{i+1})^{1/4} \le S/2\right\}$.

By the inductive hypothesis and the definition of k,

(2.19) $\Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_1,Z_k)^{1/4} \le 2 \sum_{i=1}^{k-1} \Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_i,Z_{i+1})^{1/4} \le 2(S/2) = S$.

Of course,

(2.20) $\Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_k,Z_{k+1})^{1/4} \le S$.

By the definition of k,

(2.21) $\sum_{i=1}^{k} \Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_i,Z_{i+1})^{1/4} > S/2$,

so that once more by the inductive hypothesis,

(2.22) $\Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_{k+1},Z_n)^{1/4} \le 2 \sum_{i=k+1}^{n-1} \Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_i,Z_{i+1})^{1/4} = 2S - 2 \sum_{i=1}^{k} \Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_i,Z_{i+1})^{1/4} \le S$.

From (2.19), (2.20), (2.22) and two applications of (2.16) we have

(2.23) $\Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_1,Z_n)^{1/4} \le \{4[\Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_1,Z_k) + \Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_k,Z_{k+1}) + \Delta_{\mathrm{GH}^{\mathrm{wt}}}(Z_{k+1},Z_n)]\}^{1/4} \le (4 \times 3 \times S^4)^{1/4} \le 2S$,

as required.

It is obvious by construction that $d_{\mathrm{GH}^{\mathrm{wt}}}$ satisfies the triangle inequality.
The other properties of a metric follow from the corresponding properties
we have already established for $\Delta_{\mathrm{GH}^{\mathrm{wt}}}$ and the bounds in the statement of
the lemma which we have already established. □

The procedure we used to construct the weighted Gromov-Hausdorff metric
$d_{\mathrm{GH}^{\mathrm{wt}}}$ from the semimetric $\Delta_{\mathrm{GH}^{\mathrm{wt}}}$ was adapted from a proof in [24] of
the celebrated result of Alexandroff and Urysohn on the metrizability of
uniform spaces. That proof was, in turn, adapted from earlier work of Frink
and Bourbaki. The choice of the power $\frac{1}{4}$ is not particularly special; any
sufficiently small power would have worked.

Theorem 2.5 below says that the metric space $(\mathbf{T}^{\mathrm{wt}}, d_{\mathrm{GH}^{\mathrm{wt}}})$ is complete
and separable and hence is a reasonable space on which to do probability
theory. In order to prove this result, we need a compactness criterion that
will be useful in its own right.

Proposition 2.4. A subset D of $(\mathbf{T}^{\mathrm{wt}}, d_{\mathrm{GH}^{\mathrm{wt}}})$ is relatively compact if
and only if the subset $E := \{(T,d) : (T,d,\nu) \in D\}$ in $(\mathbf{T}, d_{\mathrm{GH}})$ is relatively
compact.

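Proof. The "only if" direction is clear. Assume for the converse that E
is relatively compact. Suppose that $((T_n,d_{T_n},\nu_{T_n}))_{n\in\mathbb{N}}$ is a sequence in D.
By assumption, $((T_n,d_{T_n}))_{n\in\mathbb{N}}$ has a subsequence converging to some point
$(T,d_T)$ of $(\mathbf{T}, d_{\mathrm{GH}})$. For ease of notation, we will renumber and also denote
this subsequence by $((T_n,d_{T_n}))_{n\in\mathbb{N}}$. For brevity, we will also omit specific
mention of the metric on a real tree when it is clear from the context.

By Proposition 7.4.12 in [11], for each $\varepsilon > 0$ there is a finite $\varepsilon$-net $T^\varepsilon$ in
T and for each $n \in \mathbb{N}$ a finite $\varepsilon$-net $T_n^\varepsilon := \{x_n^{\varepsilon,1},\ldots,x_n^{\varepsilon,\#T_n^\varepsilon}\}$ in $T_n$ such that
$d_{\mathrm{GH}}(T_n^\varepsilon,T^\varepsilon) \to 0$ as $n \to \infty$. Moreover, we take $\#T_n^\varepsilon = \#T^\varepsilon = N^\varepsilon$, say, for
n sufficiently large, and so, by passing to a further subsequence if necessary,
we may assume that $\#T_n^\varepsilon = \#T^\varepsilon = N^\varepsilon$ for all $n \in \mathbb{N}$. We may then
assume that $T_n^\varepsilon$ and $T^\varepsilon$ have been indexed so that
$\lim_{n\to\infty} d_{T_n}(x_n^{\varepsilon,i},x_n^{\varepsilon,j}) = d_T(x^{\varepsilon,i},x^{\varepsilon,j})$ for $1 \le i,j \le N^\varepsilon$.

We may begin with the balls of radius $\varepsilon$ around each point of $T_n^\varepsilon$ and
decompose $T_n$ into $N^\varepsilon$ possibly empty, disjoint, measurable sets $\{T_n^{\varepsilon,1},\ldots,T_n^{\varepsilon,N^\varepsilon}\}$
of radius no greater than $\varepsilon$. Define a measurable map $f_n^\varepsilon : T_n \to T_n^\varepsilon$ by
$f_n^\varepsilon(x) = x_n^{\varepsilon,i}$ if $x \in T_n^{\varepsilon,i}$ and let $g_n^\varepsilon$ be the inclusion map from $T_n^\varepsilon$ to $T_n$. By
construction, $f_n^\varepsilon$ and $g_n^\varepsilon$ are measurable $\varepsilon$-isometries. Moreover,
$d_{\mathrm{P}}((g_n^\varepsilon)_*(f_n^\varepsilon)_*\nu_n,\nu_n) \le \varepsilon$ and, of course, $d_{\mathrm{P}}((f_n^\varepsilon)_*\nu_n,(f_n^\varepsilon)_*\nu_n) = 0$. Thus,

$\Delta_{\mathrm{GH}^{\mathrm{wt}}}((T_n^\varepsilon,(f_n^\varepsilon)_*\nu_n),(T_n,\nu_n)) \le \varepsilon$.

By similar reasoning, if we define $h_n^\varepsilon : T_n^\varepsilon \to T^\varepsilon$ by $x_n^{\varepsilon,i} \mapsto x^{\varepsilon,i}$, then

$\lim_{n\to\infty} \Delta_{\mathrm{GH}^{\mathrm{wt}}}((T_n^\varepsilon,(f_n^\varepsilon)_*\nu_n),(T^\varepsilon,(h_n^\varepsilon)_*\nu_n)) = 0$.

Since $T^\varepsilon$ is finite, by passing to a subsequence (and relabeling as before) we
have

$\lim_{n\to\infty} d_{\mathrm{P}}((h_n^\varepsilon)_*\nu_n,\nu^\varepsilon) = 0$

for some probability measure $\nu^\varepsilon$ on $T^\varepsilon$, and hence

$\lim_{n\to\infty} \Delta_{\mathrm{GH}^{\mathrm{wt}}}((T^\varepsilon,(h_n^\varepsilon)_*\nu_n),(T^\varepsilon,\nu^\varepsilon)) = 0$.

Therefore, by Lemma 2.3,

$\limsup_{n\to\infty} d_{\mathrm{GH}^{\mathrm{wt}}}((T_n,\nu_n),(T^\varepsilon,(h_n^\varepsilon)_*\nu_n)) \le \varepsilon^{1/4}$.

Now, since $(T,d_T)$ is compact, the family of measures $\{\nu^\varepsilon : \varepsilon > 0\}$ is
relatively compact, and so there is a probability measure $\nu$ on T such that
$\nu^\varepsilon$ converges to $\nu$ in the Prohorov distance along a subsequence $\varepsilon \downarrow 0$ and
hence, by arguments similar to the above, along the same subsequence
$\Delta_{\mathrm{GH}^{\mathrm{wt}}}((T^\varepsilon,\nu^\varepsilon),(T,\nu))$ converges to 0. Again applying Lemma 2.3, we have
that $d_{\mathrm{GH}^{\mathrm{wt}}}((T^\varepsilon,\nu^\varepsilon),(T,\nu))$ converges to 0 along this subsequence.

Combining the foregoing, we see that by passing to a suitable subsequence
and relabeling, $d_{\mathrm{GH}^{\mathrm{wt}}}((T_n,\nu_n),(T,\nu))$ converges to 0, as required. □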

Theorem 2.5. The metric space $(\mathbf{T}^{\mathrm{wt}}, d_{\mathrm{GH}^{\mathrm{wt}}})$ is complete and separable.

Proof. Separability follows readily from separability of $(\mathbf{T}, d_{\mathrm{GH}})$ (see
Theorem 1 in [20]), and the separability with respect to the Prohorov distance
of the probability measures on a fixed complete, separable metric space
(see, e.g., [19]), and Lemma 2.3.

It remains to establish completeness. By a standard argument, it suffices
to show that any Cauchy sequence in $\mathbf{T}^{\mathrm{wt}}$ has a convergent subsequence.
Let $(T_n,d_{T_n},\nu_n)_{n\in\mathbb{N}}$ be a Cauchy sequence in $\mathbf{T}^{\mathrm{wt}}$. Then $(T_n,d_{T_n})_{n\in\mathbb{N}}$ is
a Cauchy sequence in $\mathbf{T}$ by Lemma 2.3. By Theorem 1 in [20] there is a
$T \in \mathbf{T}$ such that $d_{\mathrm{GH}}(T_n,T) \to 0$, as $n \to \infty$. In particular, the sequence
$(T_n,d_{T_n})_{n\in\mathbb{N}}$ is relatively compact in $\mathbf{T}$, and therefore, by Proposition 2.4,
$(T_n,d_{T_n},\nu_n)_{n\in\mathbb{N}}$ is relatively compact in $\mathbf{T}^{\mathrm{wt}}$. Thus $(T_n,\nu_n)_{n\in\mathbb{N}}$ has a
convergent subsequence, as required. □

We conclude this section by giving a necessary and sufficient condition
for a subset of $(\mathbf{T}, d_{\mathrm{GH}})$ to be relatively compact, and hence, by
Proposition 2.4, a necessary and sufficient condition for a subset of $(\mathbf{T}^{\mathrm{wt}}, d_{\mathrm{GH}^{\mathrm{wt}}})$ to
be relatively compact.

Fix $(T,d) \in \mathbf{T}$ and, as usual, denote the Borel $\sigma$-algebra on T by $\mathcal{B}(T)$.
Let

(2.24) $T^o = \bigcup_{a,b \in T} \, ]a,b[$

be the skeleton of T. Observe that if $T' \subset T$ is a dense countable set,
then (2.24) holds with T replaced by T'. In particular, $T^o \in \mathcal{B}(T)$ and
$\mathcal{B}(T)|_{T^o} = \sigma(\{]a,b[;\ a,b \in T'\})$, where $\mathcal{B}(T)|_{T^o} := \{A \cap T^o;\ A \in \mathcal{B}(T)\}$. Hence
there exists a unique $\sigma$-finite measure $\mu^T$ on T, called length measure, such
that $\mu^T(T \setminus T^o) = 0$ and

(2.25) $\mu^T(]a,b[) = d(a,b) \quad \forall\, a,b \in T$.

Such a measure may be constructed as the trace onto $T^o$ of a one-dimensional
Hausdorff measure on T, and a standard monotone class argument shows
that this is the unique measure with property (2.25).

For $\varepsilon > 0$, $T \in \mathbf{T}$ and $\rho \in T$ write

(2.26) $R_\varepsilon(T,\rho) := \{x \in T : \exists\, y \in T,\ [\rho,y] \ni x,\ d_T(x,y) \ge \varepsilon\} \cup \{\rho\}$

for the $\varepsilon$-trimming relative to the root $\rho$ of the compact R-tree T. Then set

(2.27) $R_\varepsilon(T) := \begin{cases} \bigcap_{\rho \in T} R_\varepsilon(T,\rho), & \operatorname{diam}(T) > \varepsilon, \\ \text{singleton}, & \operatorname{diam}(T) \le \varepsilon, \end{cases}$

where by singleton we mean the trivial R-tree consisting of one point. The
tree $R_\varepsilon(T)$ is called the $\varepsilon$-trimming of the compact R-tree T.

Lemma 2.6. A subset E of $(\mathbf{T}, d_{\mathrm{GH}})$ is relatively compact if and only if
for all $\varepsilon > 0$,

(2.28) $\sup\{\mu^T(R_\varepsilon(T)) : T \in E\} < \infty$.

Proof. The "only if" direction follows from the fact that $T \mapsto \mu^T(R_\varepsilon(T))$
is continuous, which is essentially Lemma 7.3 of [20].

Conversely, suppose that (2.28) holds. Given $T \in E$, an $\varepsilon$-net for $R_\varepsilon(T)$ is a
$2\varepsilon$-net for T. By Lemma 2.7 below, $R_\varepsilon(T)$ has an $\varepsilon$-net of cardinality at most
$[(\frac{\varepsilon}{2})^{-1}\mu^T(R_\varepsilon(T))][(\frac{\varepsilon}{2})^{-1}\mu^T(R_\varepsilon(T)) + 1]$. By assumption, the last quantity is
uniformly bounded in $T \in E$. Thus E is uniformly totally bounded and hence
is relatively compact by Theorem 7.4.15 of [11]. □



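Lemma 2.7. Let $T \in \mathbf{T}$ be such that $\mu^T(T) < \infty$. For each $\varepsilon > 0$ there
is an $\varepsilon$-net for T of cardinality at most $[(\frac{\varepsilon}{2})^{-1}\mu^T(T)][(\frac{\varepsilon}{2})^{-1}\mu^T(T) + 1]$.

Proof. Note that an $\frac{\varepsilon}{2}$-net for $R_{\varepsilon/2}(T)$ will be an $\varepsilon$-net for T. The
set $T \setminus R_{\varepsilon/2}(T)$ is a collection of disjoint subtrees, one for each leaf of
$R_{\varepsilon/2}(T)$, and each such subtree is of diameter at least $\frac{\varepsilon}{2}$. Thus the number
of leaves of $R_{\varepsilon/2}(T)$ is at most $(\frac{\varepsilon}{2})^{-1}\mu^T(T)$. Enumerate the leaves of
$R_{\varepsilon/2}(T)$ as $x_0,x_1,\ldots,x_n$. Each arc $[x_0,x_i]$, $1 \le i \le n$, of $R_{\varepsilon/2}(T)$ has an
$\frac{\varepsilon}{2}$-net of cardinality at most $(\frac{\varepsilon}{2})^{-1}d_T(x_0,x_i) + 1 \le (\frac{\varepsilon}{2})^{-1}\mu^T(T) + 1$. Therefore,
by taking the union of these nets, $R_{\varepsilon/2}(T)$ has an $\frac{\varepsilon}{2}$-net of cardinality at
most $[(\frac{\varepsilon}{2})^{-1}\mu^T(T)][(\frac{\varepsilon}{2})^{-1}\mu^T(T) + 1]$. □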

Remark 2.8. The bound in Lemma 2.7 is far from optimal. It can be
shown that T has an $\varepsilon$-net with a cardinality that is of order $\mu^T(T)/\varepsilon$. This
is clear for finite trees (i.e., trees with a finite number of branch points),
where we can traverse the tree with a unit speed path and hence think of
the tree as an image of the interval $[0, 2\mu^T(T)]$ by a Lipschitz map with
Lipschitz constant 1, so that a covering of the interval $[0, 2\mu^T(T)]$ by $\varepsilon$-balls
gives a covering of T by $\varepsilon$-balls. This argument can be extended to arbitrary
finite length R-trees, but the details are tedious and so we have contented
ourselves with the above simpler bound.






3. Trees and continuous paths. For the sake of completeness and to
establish some notation we recall some facts about the connection between
continuous excursion paths and trees (see [4, 18, 26] for more on this
connection).

Write $C(\mathbb{R}_+)$ for the space of continuous functions from $\mathbb{R}_+$ into $\mathbb{R}$. For
$e \in C(\mathbb{R}_+)$, put $\zeta(e) := \inf\{t > 0 : e(t) = 0\}$ and write

(3.1) $U := \{e \in C(\mathbb{R}_+) : e(0) = 0,\ \zeta(e) < \infty,\ e(t) > 0 \text{ for } 0 < t < \zeta(e), \text{ and } e(t) = 0 \text{ for } t \ge \zeta(e)\}$

for the space of positive excursion paths. Set $U^\ell := \{e \in U : \zeta(e) = \ell\}$.

We associate each $e \in U^1$ with a compact R-tree as follows. Define an
equivalence relation $\sim_e$ on [0, 1] by letting

(3.2) $u_1 \sim_e u_2$ iff $e(u_1) = \inf_{u \in [u_1 \wedge u_2,\, u_1 \vee u_2]} e(u) = e(u_2)$.

Consider the following pseudometric on [0, 1]:

(3.3) $d_{T_e}(u_1,u_2) := e(u_1) - 2 \inf_{u \in [u_1 \wedge u_2,\, u_1 \vee u_2]} e(u) + e(u_2)$,

which becomes a true metric on the quotient space $T_e := \mathbb{R}_+|_{\sim_e} = [0,1]|_{\sim_e}$.
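
The pseudometric (3.3) is easy to evaluate for an excursion sampled on a grid. The sketch below assumes a piecewise linear excursion described by its heights at equally spaced times, so that the infimum over an interval between grid times is a minimum over grid values; the names `tree_distance` and `satisfies_four_point` and the sample path are illustrative choices.

```python
from itertools import product


def tree_distance(e, i, j):
    """Pseudometric (3.3) between grid times i and j of the sampled excursion e."""
    lo, hi = min(i, j), max(i, j)
    return e[i] - 2 * min(e[lo:hi + 1]) + e[j]


def satisfies_four_point(points, d, tol=1e-12):
    """Check the four point condition (2.1) for every quadruple of points."""
    for x1, x2, x3, x4 in product(points, repeat=4):
        lhs = d(x1, x2) + d(x3, x4)
        rhs = max(d(x1, x3) + d(x2, x4), d(x1, x4) + d(x2, x3))
        if lhs > rhs + tol:
            return False
    return True


# Heights of a small excursion at the grid times 0, 1, ..., 6.
heights = [0, 1, 2, 1, 2, 1, 0]
grid = range(len(heights))


def d(i, j):
    return tree_distance(heights, i, j)


two_peaks = d(2, 4)          # distance between the two local maxima
same_point = d(1, 3)         # grid times 1 and 3 are identified in the tree
is_tree_like = satisfies_four_point(grid, d)
```

In this example the two peaks are at distance 2 from each other, while the grid times 1, 3 and 5 all represent the same branch point of the associated tree.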

Lemma 3.1. For each $e \in U^1$ the metric space $(T_e,d_{T_e})$ is a compact
R-tree.

Proof. It is straightforward to check that the quotient map from [0, 1]
onto $T_e$ is continuous with respect to $d_{T_e}$. Thus $(T_e,d_{T_e})$ is path-connected
and compact as the continuous image of a metric space with these properties.
In particular, $(T_e,d_{T_e})$ is complete.

To complete the proof, it therefore suffices to verify the four point condition
(2.1). However, for $u_1,u_2,u_3,u_4 \in T_e$ we have

(3.4) $\max\{d_{T_e}(u_1,u_3) + d_{T_e}(u_2,u_4),\ d_{T_e}(u_1,u_4) + d_{T_e}(u_2,u_3)\} \ge d_{T_e}(u_1,u_2) + d_{T_e}(u_3,u_4)$,

where strict inequality holds if and only if

(3.5) $\min_{i \ne j} \inf_{u \in [u_i \wedge u_j,\, u_i \vee u_j]} e(u) \notin \left\{\inf_{u \in [u_1 \wedge u_2,\, u_1 \vee u_2]} e(u),\ \inf_{u \in [u_3 \wedge u_4,\, u_3 \vee u_4]} e(u)\right\}$.
Remark 3.2. Any compact R-tree T is isometric to $T_e$ for some $e \in U^1$.
To see this, fix a root $\rho \in T$. Recall $R_\varepsilon(T,\rho)$, the $\varepsilon$-trimming of T with
respect to $\rho$ defined in (2.26). Let $\bar\mu$ be a probability measure on T that is
equivalent to the length measure $\mu^T$. Because $\mu^T$ is $\sigma$-finite, such a
probability measure always exists, but one can construct $\bar\mu$ explicitly as follows:
set $H := \max_{u \in T} d(\rho,u)$, and put

$\bar\mu := 2^{-1} \frac{\mu^T(R_{2^{-1}H}(T,\rho) \cap \cdot)}{\mu^T(R_{2^{-1}H}(T,\rho))} + \sum_{i \ge 2} 2^{-i} \frac{\mu^T((R_{2^{-i}H}(T,\rho) \setminus R_{2^{-i+1}H}(T,\rho)) \cap \cdot)}{\mu^T(R_{2^{-i}H}(T,\rho) \setminus R_{2^{-i+1}H}(T,\rho))}$.

For all $0 < \varepsilon < H$ there is a continuous path

$f_\varepsilon : [0, 2\mu^T(R_\varepsilon(T,\rho))] \to R_\varepsilon(T,\rho)$

such that $h_\varepsilon$ defined by $h_\varepsilon(t) := d(\rho,f_\varepsilon(t))$ belongs to $U^{2\mu^T(R_\varepsilon(T,\rho))}$ [in
particular, $f_\varepsilon(0) = f_\varepsilon(2\mu^T(R_\varepsilon(T,\rho))) = \rho$], $h_\varepsilon$ is piecewise linear with slopes
$\pm 1$ and $T_{h_\varepsilon}$ is isometric to $R_\varepsilon(T,\rho)$. Moreover, these paths may be chosen
consistently so that if $\varepsilon' \le \varepsilon''$, then

$f_{\varepsilon''}(t) = f_{\varepsilon'}(\inf\{s > 0 : |\{0 \le r \le s : f_{\varepsilon'}(r) \in R_{\varepsilon''}(T,\rho)\}| > t\})$,

where $|\cdot|$ denotes Lebesgue measure. Now define $e_\varepsilon \in U^{\bar\mu(R_\varepsilon(T,\rho))}$ to be the
absolutely continuous path satisfying

It can be shown that $e_\varepsilon$ converges uniformly to some $e \in U^1$ as $\varepsilon \downarrow 0$ and
that $T_e$ is isometric to T.

From the connection we have recalled between excursion paths and real 
trees, it should be clear that the analogue of an SPR move for a real tree aris- 
ing from an excursion path is the excision and reinsertion of a subexcursion. 
Figure 2 illustrates such an operation. 

Fig. 2. A subtree prune and regraft operation on an excursion path: the excursion starting
at time u in the top picture is excised and inserted at time v, and the resulting gap between
the two points marked # is closed up. The two points marked # (resp. *) in the top (resp.
bottom) picture correspond to a single point in the associated $\mathbb{R}$-tree.

Each tree coming from a path in $U^1$ has a natural weight on it: for $e \in U^1$,
we equip $(T_e, d_{T_e})$ with the weight $\nu_{T_e}$ given by the push-forward of Lebesgue
measure on $[0,1]$ by the quotient map.

We finish this section with a remark about the natural length measure on
a tree coming from a path. Given $e \in U^1$ and $a \ge 0$, let

$$\mathcal{G}_a := \left\{ t \in [0,1] : \begin{array}{l} e(t) = a \text{ and, for some } \varepsilon > 0, \\ e(u) > a \text{ for all } u \in \,]t, t+\varepsilon[, \\ e(t+\varepsilon) = a \end{array} \right\}$$

denote the countable set of starting points of excursions of the function $e$
above the level $a$. Then $\mu^{T_e}$, the length measure on $T_e$, is just the push-forward
of the measure $\int_0^\infty da \sum_{t \in \mathcal{G}_a} \delta_t$ by the quotient map. Alternatively,
write

$$\Gamma_e := \{(s,a) : s \in \,]0,1[,\; a \in [0, e(s)[\,\} \tag{3.7}$$

for the region between the time axis and the graph of $e$, and for $(s,a) \in \Gamma_e$
denote by $\underline{s}(e,s,a) := \sup\{r < s : e(r) = a\}$ and $\bar{s}(e,s,a) := \inf\{t > s : e(t) = a\}$
the start and finish of the excursion of $e$ above level $a$ that straddles time $s$.
Then $\mu^{T_e}$ is the push-forward of the measure
$\int_{\Gamma_e} ds \otimes da\, \frac{1}{\bar{s}(e,s,a) - \underline{s}(e,s,a)}\, \delta_{\underline{s}(e,s,a)}$
by the quotient map. We note that the measure $\mu^{T_e}$ appears in [1].



4. Uniform random weighted compact $\mathbb{R}$-trees: the continuum random
tree. In this section we will recall the definition of Aldous' continuum random
tree, which can be thought of as a uniformly chosen random weighted
compact $\mathbb{R}$-tree.

Consider the Itô excursion measure for excursions of standard Brownian
motion away from 0. This $\sigma$-finite measure is defined subject to a normalization
of Brownian local time at 0, and we take the usual normalization of
local times at each level which makes the local time process an occupation
density in the spatial variable for each fixed value of the time variable. The
excursion measure is the sum of two measures, one which is concentrated on
nonnegative excursions and one which is concentrated on nonpositive excursions.
Let $\mathbb{N}$ be the part which is concentrated on nonnegative excursions.
Thus, in the notation of Section 3, $\mathbb{N}$ is a $\sigma$-finite measure on $U$, where we
equip $U$ with the $\sigma$-field $\mathcal{U}$ generated by the coordinate maps.

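Define a map $v : U \to U^1$ by $e \mapsto \frac{e(\zeta(e)\,\cdot)}{\sqrt{\zeta(e)}}$. Then

$$\mathbb{P}(\Gamma) := \frac{\mathbb{N}\{v^{-1}(\Gamma) \cap \{e \in U : \zeta(e) \ge c\}\}}{\mathbb{N}\{e \in U : \zeta(e) \ge c\}}, \qquad \Gamma \in \mathcal{U}|_{U^1}, \tag{4.1}$$

does not depend on $c > 0$ (see, e.g., Exercise 12.2.13.2 in [27]). The probability
measure $\mathbb{P}$ is called the law of normalized nonnegative Brownian
excursion. We have

$$\mathbb{N}\{e \in U : \zeta(e) \in dc\} = \frac{dc}{2\sqrt{2\pi c^3}} \tag{4.2}$$

and, defining $S_c : U^1 \to U^c$ by

$$S_c e := \sqrt{c}\, e(\cdot/c), \tag{4.3}$$

we have

$$\int \mathbb{N}(de)\, G(e) = \int_0^\infty \frac{dc}{2\sqrt{2\pi c^3}} \int_{U^1} \mathbb{P}(de)\, G(S_c e) \tag{4.4}$$

for a nonnegative measurable function $G : U \to \mathbb{R}$.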
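The rescaling map (4.3) and the length density (4.2) are easy to experiment with. The sketch below is only an illustration under stated assumptions: paths are represented as Python functions on $[0,1]$, the `tent` path is a hypothetical stand-in for an excursion, and the tail formula $\mathbb{N}\{\zeta > C\} = 1/\sqrt{2\pi C}$ is obtained by integrating (4.2).

```python
import math


def rescale(e, c):
    """Return S_c e : t -> sqrt(c) * e(t / c), a path of length c (zero outside [0, c])."""
    def scaled(t):
        return math.sqrt(c) * e(t / c) if 0.0 <= t <= c else 0.0
    return scaled


def tent(t):
    """Hypothetical unit-length path used only for illustration."""
    return min(t, 1.0 - t) if 0.0 <= t <= 1.0 else 0.0


def length_density(c):
    """Density of the excursion length in (4.2)."""
    return 1.0 / (2.0 * math.sqrt(2.0 * math.pi * c ** 3))


def length_tail(C):
    """Mass of {zeta > C}, i.e. the integral of (4.2) over (C, infinity)."""
    return 1.0 / math.sqrt(2.0 * math.pi * C)


def integrate_length_density(lo, hi, steps=20000):
    """Midpoint-rule integral of (4.2) over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(length_density(lo + (k + 0.5) * h) for k in range(steps))


if __name__ == "__main__":
    scaled = rescale(tent, 4.0)
    print(scaled(2.0), length_tail(1.0) - length_tail(2.0), integrate_length_density(1.0, 2.0))
```

Stretching time by $c$ multiplies heights by $\sqrt{c}$, which is the usual Brownian scaling and the reason only the single parameter $c$ appears in (4.4).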

Recall from Section 3 how each $e \in U^1$ is associated with a weighted compact
$\mathbb{R}$-tree $(T_e, d_{T_e}, \nu_{T_e})$. Let $\mathbf{P}$ be the probability measure on $(\mathbf{T}^{\mathrm{wt}}, d_{\mathrm{GH}^{\mathrm{wt}}})$
that is the push-forward of the normalized excursion measure by the map
$e \mapsto (T_{2e}, d_{T_{2e}}, \nu_{T_{2e}})$, where $2e \in U^1$ is just the excursion path $t \mapsto 2e(t)$.

The probability measure $\mathbf{P}$ is the distribution of an object consisting of
Aldous' continuum random tree along with a natural measure on this tree
(see, e.g., [2, 4]). The continuum random tree arises as the limit of a uniform
random tree on $n$ vertices when $n \to \infty$ and edge lengths are rescaled by
a factor of $1/\sqrt{n}$. The appearance of $2e$ rather than $e$ in the definition
of $\mathbf{P}$ is a consequence of this choice of scaling. The associated probability
measure on each realization of the continuum random tree is the measure
that arises in this limiting construction by taking the uniform probability
measure on realizations of the approximating finite trees. The probability
measure $\mathbf{P}$ can therefore be viewed informally as the "uniform distribution"
on $(\mathbf{T}^{\mathrm{wt}}, d_{\mathrm{GH}^{\mathrm{wt}}})$.

5. Campbell measure facts. For the purposes of constructing the Markov
process that is of interest to us, we need to understand picking a random
weighted tree $(T, d_T, \nu_T)$ according to the continuum random tree distribution
$\mathbf{P}$, picking a point $u$ according to the length measure $\mu^T$ and another
point $v$ according to the weight $\nu_T$, and then decomposing $T$ into two subtrees
rooted at $u$: one that contains $v$ and one that does not (we are being
a little imprecise here, because $\mu^T$ will be an infinite measure, $\mathbf{P}$ almost
surely).

In order to understand this decomposition, we must understand the
corresponding decomposition of excursion paths under normalized excursion
measure. Because subtrees correspond to subexcursions and because
of our observation in Section 3 that for an excursion $e$ the length measure
$\mu^{T_e}$ on the corresponding tree is the push-forward of the measure
$\int_{\Gamma_e} ds \otimes da\, \frac{1}{\bar{s}(e,s,a) - \underline{s}(e,s,a)}\, \delta_{\underline{s}(e,s,a)}$ by the quotient map, we need to understand
the decomposition of the excursion $e$ into the excursion above $a$ that
straddles $s$ and the "remaining" excursion when $e$ is chosen according to the
standard Brownian excursion distribution $\mathbb{P}$ and $(s,a)$ is chosen according
to the $\sigma$-finite measure $ds \otimes da\, \frac{1}{\bar{s}(e,s,a) - \underline{s}(e,s,a)}$ on $\Gamma_e$; see Figure 3.

Fig. 3. The decomposition of the excursion $e$ (top picture) into the excursion $\hat{e}^{s,a}$ above
level $a$ that straddles time $s$ (bottom left picture) and the "remaining" excursion $\check{e}^{s,a}$
(bottom right picture).

Given an excursion $e \in U$ and a level $a \ge 0$ write:

(a) $\zeta(e) := \inf\{t > 0 : e(t) = 0\}$ for the "length" of $e$,

(b) $\ell^a_t(e)$ for the local time of $e$ at level $a$ up to time $t$,

(c) $e^{\downarrow a}$ for $e$ time-changed by the inverse of $t \mapsto \int_0^t ds\, \mathbf{1}\{e(s) \le a\}$ (i.e., $e^{\downarrow a}$
is $e$ with the subexcursions above level $a$ excised and the gaps closed up),

(d) $\ell^a_t(e^{\downarrow a})$ for the local time of $e^{\downarrow a}$ at the level $a$ up to time $t$,

(e) $U^{\uparrow a}(e)$ for the set of subexcursion intervals of $e$ above $a$ (i.e., an
element of $U^{\uparrow a}(e)$ is an interval $I = [g_I, d_I]$ such that $e(g_I) = e(d_I) = a$ and
$e(t) > a$ for $g_I < t < d_I$),

(f) $\mathcal{N}^{\uparrow a}(e)$ for the counting measure that puts a unit mass at each point
$(s', e')$, where, for some $I \in U^{\uparrow a}(e)$, $s' := \ell^a_{g_I}(e)$ is the amount of local time
of $e$ at level $a$ accumulated up to the beginning of the subexcursion $I$ and
$e' \in U$ given by

$$e'(t) = \begin{cases} e(g_I + t) - a, & 0 \le t \le d_I - g_I, \\ 0, & t > d_I - g_I, \end{cases} \tag{5.1}$$

is the corresponding piece of the path $e$ shifted to become an excursion above
the level 0 starting at time 0,

(g) $\hat{e}^{s,a} \in U$ and $\check{e}^{s,a} \in U$, for the subexcursion "above" $(s,a) \in \Gamma_e$, that
is,

$$\hat{e}^{s,a}(t) := \begin{cases} e(\underline{s}(e,s,a) + t) - a, & 0 \le t \le \bar{s}(e,s,a) - \underline{s}(e,s,a), \\ 0, & t > \bar{s}(e,s,a) - \underline{s}(e,s,a), \end{cases} \tag{5.2}$$

respectively "below" $(s,a) \in \Gamma_e$, that is,

$$\check{e}^{s,a}(t) := \begin{cases} e(t), & 0 \le t \le \underline{s}(e,s,a), \\ e(t + \bar{s}(e,s,a) - \underline{s}(e,s,a)), & t > \underline{s}(e,s,a), \end{cases} \tag{5.3}$$

(h) $\sigma^a_s(e) := \inf\{t \ge 0 : \ell^a_t(e) \ge s\}$ and $\tau^a_s(e) := \inf\{t \ge 0 : \ell^a_t(e) > s\}$,

(i) $\tilde{e}^{s,a} \in U$ for $e$ with the interval $]\sigma^a_s(e), \tau^a_s(e)[$ containing an excursion
above level $a$ excised, that is,

$$\tilde{e}^{s,a}(t) := \begin{cases} e(t), & 0 \le t \le \sigma^a_s(e), \\ e(t + \tau^a_s(e) - \sigma^a_s(e)), & t > \sigma^a_s(e). \end{cases} \tag{5.4}$$
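The decomposition in item (g) has an exact lattice analogue that may help fix ideas. The sketch below is an illustration under stated assumptions: the excursion is a nearest-neighbour lattice path (a Dyck path) stored as a list of integer heights, the level $a$ is an integer, and the function names are hypothetical. In that setting the start and finish of the straddling excursion are attained at grid times.

```python
def straddling_interval(e, s, a):
    """Start and finish of the excursion of e above level a that straddles time s."""
    if not (0 < s < len(e) - 1 and 0 <= a < e[s]):
        raise ValueError("(s, a) must lie strictly below the graph of e")
    start = max(r for r in range(s) if e[r] == a)
    finish = min(t for t in range(s + 1, len(e)) if e[t] == a)
    return start, finish


def decompose(e, s, a):
    """Return the excursion above (s, a), shifted to start at 0, and the remaining path."""
    start, finish = straddling_interval(e, s, a)
    above = [h - a for h in e[start:finish + 1]]   # lattice version of the "above" piece
    below = e[:start + 1] + e[finish + 1:]          # gap closed up, the "below" piece
    return above, below


if __name__ == "__main__":
    path = [0, 1, 2, 3, 2, 1, 2, 1, 0]
    above, below = decompose(path, s=3, a=1)
    print(above, below)
```

The number of steps of the two pieces adds up to the number of steps of the original path, mirroring the fact that the lengths of the two excursions in item (g) sum to $\zeta(e)$.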

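The following path decomposition result under the $\sigma$-finite measure $\mathbb{N}$
is preparatory to a decomposition under the probability measure $\mathbb{P}$, Corollary 5.2,
that has a simpler intuitive interpretation.

Proposition 5.1. For nonnegative measurable functions $F$ on $\mathbb{R}_+$ and
$G, H$ on $U$,

$$\begin{aligned}
&\int \mathbb{N}(de) \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e,s,a) - \underline{s}(e,s,a)}\, F(\underline{s}(e,s,a))\, G(\hat{e}^{s,a})\, H(\check{e}^{s,a}) \\
&\quad = \int \mathbb{N}(de) \int_0^\infty da \int \mathcal{N}^{\uparrow a}(e)(d(s',e'))\, F(\sigma^a_{s'}(e))\, G(e')\, H(\tilde{e}^{s',a}) \\
&\quad = \mathbb{N}[G]\; \mathbb{N}\Big[ H \int_0^\zeta ds\, F(s) \Big].
\end{aligned}$$

Proof. The first equality is just a change in the order of integration
and has already been remarked upon in Section 3.

Standard excursion theory (see, e.g., [8, 27, 28]) says that under $\mathbb{N}$, the
random measure $e \mapsto \mathcal{N}^{\uparrow a}(e)$ conditional on $e \mapsto e^{\downarrow a}$ is a Poisson random
measure with intensity measure $\lambda^{\downarrow a}(e) \otimes \mathbb{N}$, where $\lambda^{\downarrow a}(e)$ is Lebesgue measure
restricted to the interval $[0, \ell^a_\infty(e)] = [0, 2\ell^a_\infty(e^{\downarrow a})]$.

Note that $\tilde{e}^{s',a}$ is constructed from $e^{\downarrow a}$ and $\mathcal{N}^{\uparrow a}(e) - \delta_{(s',e')}$ in the same
way that $e$ is constructed from $e^{\downarrow a}$ and $\mathcal{N}^{\uparrow a}(e)$. Also, $\sigma^a_{s'}(\tilde{e}^{s',a}) = \sigma^a_{s'}(e)$.
Therefore, by the Campbell-Palm formula for Poisson random measures
(see, e.g., Section 12.1 of [13]),

$$\begin{aligned}
&\int \mathbb{N}(de) \int_0^\infty da \int \mathcal{N}^{\uparrow a}(e)(d(s',e'))\, F(\sigma^a_{s'}(e))\, G(e')\, H(\tilde{e}^{s',a}) \\
&\quad = \int \mathbb{N}(de) \int_0^\infty da\; \mathbb{N}\Big[ \int \mathcal{N}^{\uparrow a}(e)(d(s',e'))\, F(\sigma^a_{s'}(e))\, G(e')\, H(\tilde{e}^{s',a}) \,\Big|\, e^{\downarrow a} \Big] \\
&\quad = \int \mathbb{N}(de) \int_0^\infty da\; \mathbb{N}[G]\; \mathbb{N}\Big[ \Big\{ \int_0^{\ell^a_\infty(e)} ds'\, F(\sigma^a_{s'}(e)) \Big\} H \,\Big|\, e^{\downarrow a} \Big] \\
&\quad = \mathbb{N}[G] \int_0^\infty da \int \mathbb{N}(de) \Big( \Big\{ \int d\ell^a_s(e)\, F(s) \Big\} H(e) \Big) \\
&\quad = \mathbb{N}[G] \int \mathbb{N}(de) \Big( \Big\{ \int_0^\infty da \int d\ell^a_s(e)\, F(s) \Big\} H(e) \Big) \\
&\quad = \mathbb{N}[G]\; \mathbb{N}\Big[ H \int_0^\zeta ds\, F(s) \Big].
\end{aligned}$$

□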



The next result says that if we pick an excursion $e$ according to the
standard excursion distribution $\mathbb{P}$ and then pick a point $(s,a) \in \Gamma_e$ according
to the $\sigma$-finite measure on $\Gamma_e$ corresponding to the length measure $\mu^{T_e}$ on
the associated tree $T_e$ (see the end of Section 3), then the following objects
are independent:

(a) the length of the excursion above level $a$ that straddles time $s$,

(b) the excursion obtained by taking the excursion above level $a$ that
straddles time $s$, turning it (by a shift of axes) into an excursion $\hat{e}^{s,a}$ above
level zero starting at time zero, and then Brownian rescaling $\hat{e}^{s,a}$ to produce
an excursion of unit length,

(c) the excursion obtained by taking the excursion $\check{e}^{s,a}$ that comes from
excising $\hat{e}^{s,a}$ and closing up the gap, and then Brownian rescaling $\check{e}^{s,a}$ to
produce an excursion of unit length,

(d) the starting time $\underline{s}(e,s,a)$ of the excursion above level $a$ that straddles
time $s$ rescaled by the length of $\check{e}^{s,a}$ to give a time in the interval $[0,1]$.

Moreover, the length in (a) is "distributed" according to the $\sigma$-finite measure

$$\frac{1}{2\sqrt{2\pi}}\, \frac{d\rho}{\sqrt{(1-\rho)\rho^3}}, \qquad 0 \le \rho \le 1, \tag{5.5}$$

the unit length excursions in (b) and (c) are both distributed as standard
Brownian excursions (i.e., according to $\mathbb{P}$) and the time in (d) is uniformly
distributed on the interval $[0,1]$.

Recall from (4.3) that $S_c : U^1 \to U^c$ is the Brownian rescaling map defined
by

$$S_c e := \sqrt{c}\, e(\cdot/c).$$

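Corollary 5.2. For nonnegative measurable functions $F$ on $\mathbb{R}_+$ and $K$
on $U \times U$,

$$\begin{aligned}
&\int \mathbb{P}(de) \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e,s,a) - \underline{s}(e,s,a)}\, F\Big( \frac{\underline{s}(e,s,a)}{\zeta(\check{e}^{s,a})} \Big)\, K(\hat{e}^{s,a}, \check{e}^{s,a}) \\
&\quad = \Big\{ \int_0^1 du\, F(u) \Big\} \frac{1}{2\sqrt{2\pi}} \int_0^1 \frac{d\rho}{\sqrt{(1-\rho)\rho^3}} \int \mathbb{P}(de') \otimes \mathbb{P}(de'')\, K(S_\rho e', S_{1-\rho} e'').
\end{aligned}$$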

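Proof. For a nonnegative measurable function $L$ on $U \times U$, it follows
straightforwardly from Proposition 5.1 that

$$\begin{aligned}
&\int \mathbb{N}(de) \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e,s,a) - \underline{s}(e,s,a)}\, F\Big( \frac{\underline{s}(e,s,a)}{\zeta(\check{e}^{s,a})} \Big)\, L(\hat{e}^{s,a}, \check{e}^{s,a}) \\
&\quad = \Big\{ \int_0^1 du\, F(u) \Big\} \int \mathbb{N}(de') \otimes \mathbb{N}(de'')\, L(e', e'')\, \zeta(e'').
\end{aligned} \tag{5.6}$$

The left-hand side of (5.6) is, by (4.4),

$$\int_0^\infty \frac{dc}{2\sqrt{2\pi c^3}} \int \mathbb{P}(de) \int_{\Gamma_{S_c e}} \frac{ds \otimes da}{\bar{s}(S_c e,s,a) - \underline{s}(S_c e,s,a)}\, F\Big( \frac{\underline{s}(S_c e,s,a)}{\zeta(\widecheck{S_c e}^{\,s,a})} \Big)\, L\big(\widehat{S_c e}^{\,s,a}, \widecheck{S_c e}^{\,s,a}\big). \tag{5.7}$$

If we change variables to $t = s/c$ and $b = a/\sqrt{c}$, then the integral for $(s,a)$
over $\Gamma_{S_c e}$ becomes an integral for $(t,b)$ over $\Gamma_e$. Also,

$$\begin{aligned}
\underline{s}(S_c e, ct, \sqrt{c}\, b) &= \sup\Big\{ r < ct : \sqrt{c}\, e\Big(\frac{r}{c}\Big) < \sqrt{c}\, b \Big\} \\
&= c \sup\{ r < t : e(r) < b \} \\
&= c\, \underline{s}(e,t,b),
\end{aligned} \tag{5.8}$$

and, by similar reasoning,

$$\bar{s}(S_c e, ct, \sqrt{c}\, b) = c\, \bar{s}(e,t,b) \tag{5.9}$$

and

$$\zeta\big(\widecheck{S_c e}^{\,ct,\sqrt{c}b}\big) = c\, \zeta(\check{e}^{t,b}). \tag{5.10}$$

Thus (5.7) is

$$\int_0^\infty \frac{dc}{2\sqrt{2\pi c^3}} \int \mathbb{P}(de)\, \sqrt{c} \int_{\Gamma_e} \frac{dt \otimes db}{\bar{s}(e,t,b) - \underline{s}(e,t,b)}\, F\Big( \frac{\underline{s}(e,t,b)}{\zeta(\check{e}^{t,b})} \Big)\, L\big(\widehat{S_c e}^{\,ct,\sqrt{c}b}, \widecheck{S_c e}^{\,ct,\sqrt{c}b}\big). \tag{5.11}$$

Now suppose that $L$ is of the form

$$L(e', e'') = K\big(\mathcal{R}_{\zeta(e')+\zeta(e'')} e',\; \mathcal{R}_{\zeta(e')+\zeta(e'')} e''\big)\, \frac{M(\zeta(e') + \zeta(e''))}{\sqrt{\zeta(e') + \zeta(e'')}}, \tag{5.12}$$

where, for ease of notation, we put for $e \in U$, and $c > 0$,

$$\mathcal{R}_c e := S_{c^{-1}} e = \frac{1}{\sqrt{c}}\, e(c\, \cdot). \tag{5.13}$$

Then (5.11) becomes

$$\int_0^\infty \frac{dc}{2\sqrt{2\pi c^3}} \int \mathbb{P}(de) \int_{\Gamma_e} \frac{dt \otimes db}{\bar{s}(e,t,b) - \underline{s}(e,t,b)}\, F\Big( \frac{\underline{s}(e,t,b)}{\zeta(\check{e}^{t,b})} \Big)\, K(\hat{e}^{t,b}, \check{e}^{t,b})\, M(c). \tag{5.14}$$

Since (5.14) was shown to be equivalent to the left-hand side of (5.6), it
follows from (4.4) that

$$\begin{aligned}
&\int \mathbb{P}(de) \int_{\Gamma_e} \frac{dt \otimes db}{\bar{s}(e,t,b) - \underline{s}(e,t,b)}\, F\Big( \frac{\underline{s}(e,t,b)}{\zeta(\check{e}^{t,b})} \Big)\, K(\hat{e}^{t,b}, \check{e}^{t,b}) \\
&\quad = \frac{\int_0^1 du\, F(u)}{\mathbb{N}[M]} \int \mathbb{N}(de') \otimes \mathbb{N}(de'')\, L(e', e'')\, \zeta(e''),
\end{aligned} \tag{5.15}$$

and the first equality of the statement follows.

We have from identity (5.15) that, for any $C > 0$,

$$\begin{aligned}
&\mathbb{N}\{\zeta(e) > C\} \int \mathbb{P}(de) \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e,s,a) - \underline{s}(e,s,a)}\, K(\hat{e}^{s,a}, \check{e}^{s,a}) \\
&\quad = \int \mathbb{N}(de') \otimes \mathbb{N}(de'')\, K\big(\mathcal{R}_{\zeta(e')+\zeta(e'')} e', \mathcal{R}_{\zeta(e')+\zeta(e'')} e''\big)\, \frac{\mathbf{1}\{\zeta(e') + \zeta(e'') > C\}}{\sqrt{\zeta(e') + \zeta(e'')}}\, \zeta(e'') \\
&\quad = \int_0^\infty \frac{dc'}{2\sqrt{2\pi c'^3}} \int_0^\infty \frac{dc''}{2\sqrt{2\pi c''}}\, \frac{\mathbf{1}\{c' + c'' > C\}}{\sqrt{c' + c''}} \int \mathbb{P}(de') \otimes \mathbb{P}(de'')\, K\big(\mathcal{R}_{c'+c''} S_{c'} e', \mathcal{R}_{c'+c''} S_{c''} e''\big).
\end{aligned}$$

Make the change of variables $\rho = \frac{c'}{c'+c''}$ and $\xi = c' + c''$ (with corresponding
Jacobian factor $\xi$) to get

$$\begin{aligned}
&\int_0^\infty \frac{dc'}{2\sqrt{2\pi c'^3}} \int_0^\infty \frac{dc''}{2\sqrt{2\pi c''}}\, \frac{\mathbf{1}\{c' + c'' > C\}}{\sqrt{c' + c''}} \int \mathbb{P}(de') \otimes \mathbb{P}(de'')\, K\big(\mathcal{R}_{c'+c''} S_{c'} e', \mathcal{R}_{c'+c''} S_{c''} e''\big) \\
&\quad = \Big( \frac{1}{2\sqrt{2\pi}} \Big)^2 \int_0^\infty d\xi \int_0^1 d\rho\, \frac{\xi}{\sqrt{\rho^3 (1-\rho)}\, \xi^2}\, \frac{\mathbf{1}\{\xi > C\}}{\sqrt{\xi}} \int \mathbb{P}(de') \otimes \mathbb{P}(de'')\, K(S_\rho e', S_{1-\rho} e'') \\
&\quad = \Big( \frac{1}{2\sqrt{2\pi}} \Big)^2 \Big\{ \int_C^\infty \frac{d\xi}{\sqrt{\xi^3}} \Big\} \int_0^1 \frac{d\rho}{\sqrt{\rho^3 (1-\rho)}} \int \mathbb{P}(de') \otimes \mathbb{P}(de'')\, K(S_\rho e', S_{1-\rho} e''),
\end{aligned}$$

and the corollary follows upon recalling (4.2). □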

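Corollary 5.3. (i) For $x > 0$,

$$\int \mathbb{P}(de) \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e,s,a) - \underline{s}(e,s,a)}\, \mathbf{1}\Big\{ \max_{0 \le t \le \zeta(\hat{e}^{s,a})} \hat{e}^{s,a} > x \Big\} = 2 \sum_{n=1}^\infty n x \exp(-2n^2x^2).$$

(ii) For $0 < p \le 1$,

$$\int \mathbb{P}(de) \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e,s,a) - \underline{s}(e,s,a)}\, \mathbf{1}\{\zeta(\hat{e}^{s,a}) > p\} = \sqrt{\frac{1-p}{2\pi p}}.$$

Proof. (i) Recall first of all from Theorem 5.2.10 in [25] that

$$\mathbb{P}\Big\{ e \in U^1 : \max_{0 \le t \le 1} e(t) > x \Big\} = 2 \sum_{n=1}^\infty (4n^2x^2 - 1) \exp(-2n^2x^2). \tag{5.16}$$

By Corollary 5.2 applied to $K(e', e'') := \mathbf{1}\{\max_{t \in [0,\zeta(e')]} e'(t) > x\}$ and $F \equiv 1$,

$$\begin{aligned}
&\int \mathbb{P}(de) \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e,s,a) - \underline{s}(e,s,a)}\, \mathbf{1}\Big\{ \max_{0 \le t \le \zeta(\hat{e}^{s,a})} \hat{e}^{s,a} > x \Big\} \\
&\quad = \frac{1}{2\sqrt{2\pi}} \int_0^1 \frac{d\rho}{\sqrt{\rho^3(1-\rho)}}\, \mathbb{P}\Big\{ \max_{t \in [0,\rho]} \sqrt{\rho}\, e(t/\rho) > x \Big\} \\
&\quad = \frac{1}{2\sqrt{2\pi}} \int_0^1 \frac{d\rho}{\sqrt{\rho^3(1-\rho)}}\, \mathbb{P}\Big\{ \max_{t \in [0,1]} e(t) > \frac{x}{\sqrt{\rho}} \Big\} \\
&\quad = \frac{1}{2\sqrt{2\pi}} \int_0^1 \frac{d\rho}{\sqrt{\rho^3(1-\rho)}}\; 2 \sum_{n=1}^\infty \Big( \frac{4n^2x^2}{\rho} - 1 \Big) \exp\Big( -\frac{2n^2x^2}{\rho} \Big) \\
&\quad = 2 \sum_{n=1}^\infty n x \exp(-2n^2x^2),
\end{aligned}$$

as claimed.

(ii) Corollary 5.2 applied to $K(e', e'') := \mathbf{1}\{\zeta(e') > p\}$ and $F \equiv 1$ immediately
yields

$$\int \mathbb{P}(de) \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e,s,a) - \underline{s}(e,s,a)}\, \mathbf{1}\{\zeta(\hat{e}^{s,a}) > p\} = \frac{1}{2\sqrt{2\pi}} \int_p^1 \frac{d\rho}{\sqrt{\rho^3(1-\rho)}} = \sqrt{\frac{1-p}{2\pi p}}. \qquad □$$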
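Both closed forms in Corollary 5.3 can be sanity-checked by numerical integration against the density (5.5). The sketch below is a rough check under stated assumptions: the substitution $\rho = 1 - (1-p)u^2$ is our own device for removing the integrable singularity at $\rho = 1$, the series (5.16) is truncated at a fixed number of terms (adequate when $x$ is not small), and all function names are hypothetical.

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)


def max_tail(z, terms=60):
    """Truncated series (5.16) for the law of the maximum; intended for z of order 1 or more."""
    return 2.0 * sum((4.0 * n * n * z * z - 1.0) * math.exp(-2.0 * n * n * z * z)
                     for n in range(1, terms + 1))


def integrate_against_density(g, p=0.0, steps=20000):
    """Integrate g(rho) against (5.5) over [p, 1] using rho = 1 - (1 - p) * u**2."""
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) / steps
        rho = 1.0 - (1.0 - p) * u * u
        # density (5.5) times |d rho / d u| simplifies to a bounded factor
        total += g(rho) * math.sqrt(1.0 - p) / (SQRT_2PI * rho ** 1.5)
    return total / steps


def height_tail_numeric(x):
    """Left side of Corollary 5.3(i) after applying Corollary 5.2."""
    return integrate_against_density(lambda rho: max_tail(x / math.sqrt(rho)))


def height_tail_series(x, terms=60):
    """Right side of Corollary 5.3(i)."""
    return 2.0 * sum(n * x * math.exp(-2.0 * n * n * x * x) for n in range(1, terms + 1))


def length_tail_numeric(p):
    """Left side of Corollary 5.3(ii) after applying Corollary 5.2."""
    return integrate_against_density(lambda rho: 1.0, p=p)


def length_tail_closed_form(p):
    """Right side of Corollary 5.3(ii)."""
    return math.sqrt((1.0 - p) / (2.0 * math.pi * p))


if __name__ == "__main__":
    print(height_tail_numeric(1.0), height_tail_series(1.0))
    print(length_tail_numeric(0.25), length_tail_closed_form(0.25))
```

The agreement of the two columns is only a numerical plausibility check of the displayed identities, not a substitute for the proof.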

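We conclude this section by calculating the expectations of some functionals
with respect to $\mathbf{P}$ [the "uniform distribution" on $(\mathbf{T}^{\mathrm{wt}}, d_{\mathrm{GH}^{\mathrm{wt}}})$ as
introduced in the end of Section 4].

For $T \in \mathbf{T}^{\mathrm{wt}}$, and $\rho \in T$, recall $R_c(T,\rho)$ from (2.26), and the length measure
$\mu^T$ from (2.25). Given $(T,d) \in \mathbf{T}^{\mathrm{wt}}$ and $u, v \in T$, let

$$S^{T,u,v} := \{ w \in T : u \in \,]v,w[\, \} \tag{5.17}$$

denote the subtree of $T$ that differs from its closure by the point $u$, which
can be thought of as its root, and consists of points that are on the "other
side" of $u$ from $v$ (recall $]v,w[$ is the open arc in $T$ between $v$ and $w$).

Lemma 5.4. (i) For $x > 0$,

$$\begin{aligned}
&\mathbf{P}\big[ \mu^T \otimes \nu_T \{ (u,v) \in T \times T : \mathrm{height}(S^{T,u,v}) > x \} \big] \\
&\quad = \mathbf{P}\Big[ \int_T \nu_T(dv)\, \mu^T(R_x(T,v)) \Big] \\
&\quad = 2 \sum_{n=1}^\infty n x \exp(-n^2x^2/2).
\end{aligned}$$

(ii) For $1 < \alpha < \infty$,

$$\mathbf{P}\Big[ \int_T \nu_T(dv) \int_T \mu^T(du)\, \big(\mathrm{height}(S^{T,u,v})\big)^\alpha \Big] = 2^{(\alpha+1)/2}\, \alpha\, \Gamma\Big( \frac{\alpha+1}{2} \Big)\, \zeta(\alpha),$$

where, as usual, $\zeta(\alpha) := \sum_{n \ge 1} n^{-\alpha}$.

(iii) For $0 < p \le 1$,

$$\mathbf{P}\big[ \mu^T \otimes \nu_T \{ (u,v) \in T \times T : \nu_T(S^{T,u,v}) > p \} \big] = \sqrt{\frac{2(1-p)}{\pi p}}.$$

(iv) For $\frac{1}{2} < \beta < \infty$,

$$\mathbf{P}\Big[ \int_T \nu_T(dv) \int_T \mu^T(du)\, \big(\nu_T(S^{T,u,v})\big)^\beta \Big] = 2^{-1/2}\, \frac{\Gamma(\beta - 1/2)}{\Gamma(\beta)}.$$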

Proof. (i) The first equality is clear from the definition of $R_x(T,v)$ and
Fubini's theorem.

Turning to the equality of the first and last terms, first recall that $\mathbf{P}$ is
the push-forward on $(\mathbf{T}^{\mathrm{wt}}, d_{\mathrm{GH}^{\mathrm{wt}}})$ of the normalized excursion measure $\mathbb{P}$
by the map $e \mapsto (T_{2e}, d_{T_{2e}}, \nu_{T_{2e}})$, where $2e \in U^1$ is just the excursion path
$t \mapsto 2e(t)$. In particular, $T_{2e}$ is the quotient of the interval $[0,1]$ by the equivalence
relation defined by $2e$. By the invariance of the standard Brownian
excursion under random rerooting (see Section 2.7 of [3]), the point in $T_{2e}$
that corresponds to the equivalence class of $0 \in [0,1]$ is distributed according
to $\nu_{T_{2e}}$ when $e$ is chosen according to $\mathbb{P}$. Moreover, recall from the end of
Section 3 that for $e \in U^1$, the length measure $\mu^{T_e}$ is the push-forward of the
measure $ds \otimes da\, \frac{1}{\bar{s}(e,s,a) - \underline{s}(e,s,a)}\, \delta_{\underline{s}(e,s,a)}$ on the subgraph $\Gamma_e$ by the quotient
map defined in (3.2).

It follows that if we pick $T$ according to $\mathbf{P}$ and then pick $(u,v) \in T \times T$
according to $\mu^T \otimes \nu_T$, then the subtree $S^{T,u,v}$ that arises has the same $\sigma$-finite
law as the tree associated with the excursion $2\hat{e}^{s,a}$ when $e$ is chosen
according to $\mathbb{P}$ and $(s,a)$ is chosen according to the measure
$ds \otimes da\, \frac{1}{\bar{s}(e,s,a) - \underline{s}(e,s,a)}\, \delta_{\underline{s}(e,s,a)}$ on the subgraph $\Gamma_e$.

Therefore, by part (i) of Corollary 5.3,

$$\begin{aligned}
&\mathbf{P}\Big[ \int_T \nu_T(dv) \int_T \mu^T(du)\, \mathbf{1}\{ \mathrm{height}(S^{T,u,v}) > x \} \Big] \\
&\quad = 2 \int \mathbb{P}(de) \int_{\Gamma_e} \frac{ds \otimes da}{\bar{s}(e,s,a) - \underline{s}(e,s,a)}\, \mathbf{1}\Big\{ \max_{0 \le t \le \zeta(\hat{e}^{s,a})} \hat{e}^{s,a} > \frac{x}{2} \Big\} \\
&\quad = 2 \sum_{n=1}^\infty n x \exp(-n^2x^2/2).
\end{aligned}$$

Part (ii) is a consequence of part (i) and some straightforward calculus.
Part (iii) follows immediately from part (ii) of Corollary 5.3.
Part (iv) is a consequence of part (iii) and some more straightforward
calculus. □
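The "straightforward calculus" behind part (iv) is the layer-cake formula $\int \nu^\beta = \int_0^1 \beta p^{\beta-1}\, (\text{tail at } p)\, dp$ applied to the tail in part (iii). The sketch below checks this numerically; the substitution $p = \sin^2\theta$ is our own choice for taming the endpoint behaviour, the check is restricted to $\beta \ge 1$ so that the transformed integrand is bounded, and the function names are hypothetical.

```python
import math


def mass_tail(p):
    """Tail from Lemma 5.4(iii): sqrt(2 (1 - p) / (pi p))."""
    return math.sqrt(2.0 * (1.0 - p) / (math.pi * p))


def mass_moment_numeric(beta, steps=20000):
    """Layer-cake integral of beta * p**(beta - 1) * mass_tail(p) over (0, 1), for beta >= 1."""
    if beta < 1.0:
        raise ValueError("this sketch only handles beta >= 1")
    total = 0.0
    for k in range(steps):
        theta = (k + 0.5) / steps * (math.pi / 2.0)
        # with p = sin(theta)**2 the integrand becomes bounded on [0, pi / 2]
        total += (2.0 * beta * math.sqrt(2.0 / math.pi)
                  * math.sin(theta) ** (2.0 * beta - 2.0) * math.cos(theta) ** 2)
    return total * (math.pi / 2.0) / steps


def mass_moment_closed_form(beta):
    """Right side of Lemma 5.4(iv): 2**(-1/2) * Gamma(beta - 1/2) / Gamma(beta)."""
    return math.gamma(beta - 0.5) / (math.sqrt(2.0) * math.gamma(beta))


if __name__ == "__main__":
    for beta in (1.0, 2.0, 3.5):
        print(beta, mass_moment_numeric(beta), mass_moment_closed_form(beta))
```

The case $\beta = 2$ is the one used later for the square-integrability statement in Lemma 6.2(iii).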



6. A symmetric jump measure on $(\mathbf{T}^{\mathrm{wt}}, d_{\mathrm{GH}^{\mathrm{wt}}})$. In this section we will
construct and study a measure on $\mathbf{T}^{\mathrm{wt}} \times \mathbf{T}^{\mathrm{wt}}$ that is related to the decomposition
discussed at the beginning of Section 5.

Define a map $\Theta$ from $\{((T,d),u,v) : T \in \mathbf{T},\, u \in T,\, v \in T\}$ into $\mathbf{T}$ by setting
$\Theta((T,d),u,v) := (T, d^{(u,v)})$, letting

$$d^{(u,v)}(x,y) := \begin{cases}
d(x,y), & \text{if } x, y \in S^{T,u,v}, \\
d(x,y), & \text{if } x, y \in T \setminus S^{T,u,v}, \\
d(x,u) + d(v,y), & \text{if } x \in S^{T,u,v},\; y \in T \setminus S^{T,u,v}, \\
d(y,u) + d(v,x), & \text{if } y \in S^{T,u,v},\; x \in T \setminus S^{T,u,v}.
\end{cases} \tag{6.1}$$

That is, $\Theta((T,d),u,v)$ is just $T$ as a set, but the metric has been changed
so that the subtree $S^{T,u,v}$ with root $u$ is now pruned and regrafted so as to
have root $v$.

If $(T,d,\nu) \in \mathbf{T}^{\mathrm{wt}}$ and $(u,v) \in T \times T$, then we can think of $\nu$ as a weight on
$(T, d^{(u,v)})$, because the Borel structures induced by $d$ and $d^{(u,v)}$ are the same.
With a slight misuse of notation we will therefore write $\Theta((T,d,\nu),u,v)$ for
$(T, d^{(u,v)}, \nu) \in \mathbf{T}^{\mathrm{wt}}$. Intuitively, the mass contained in $S^{T,u,v}$ is transported
along with the subtree.

Define a kernel $\kappa$ on $\mathbf{T}^{\mathrm{wt}}$ by

$$\kappa((T, d_T, \nu_T), \mathbb{B}) := \mu^T \otimes \nu_T \{ (u,v) \in T \times T : \Theta(T,u,v) \in \mathbb{B} \} \tag{6.2}$$

for $\mathbb{B} \in \mathcal{B}(\mathbf{T}^{\mathrm{wt}})$. Thus $\kappa((T, d_T, \nu_T), \cdot)$ is the jump kernel described informally
in the Introduction.
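The change of metric in the definition of $\Theta$ can be tried out on a finite tree. The following is a minimal sketch under stated assumptions: the tree is given by finitely many vertices and weighted edges, only vertices (not interior points of edges) are tracked, and the helper names are hypothetical. The pruned set is the vertex analogue of $S^{T,u,v}$ from (5.17), so $u$ itself stays behind.

```python
def path_metric(n, edges):
    """All-pairs distances of a finite tree given as (i, j, length) edges."""
    inf = float("inf")
    d = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for i, j, w in edges:
        d[i][j] = d[j][i] = float(w)
    for k in range(n):  # Floyd-Warshall, adequate for a small example
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def pruned_subtree(d, u, v, tol=1e-12):
    """Vertices w with u strictly between v and w (finite stand-in for S^{T,u,v})."""
    if u == v:
        return set()
    return {w for w in range(len(d))
            if w not in (u, v) and abs(d[v][u] + d[u][w] - d[v][w]) < tol}


def spr_metric(d, u, v):
    """Metric d^{(u,v)}: prune the subtree rooted at u and regraft it at v."""
    s = pruned_subtree(d, u, v)
    n = len(d)
    new = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if (x in s) == (y in s):
                new[x][y] = d[x][y]
            elif x in s:
                new[x][y] = d[x][u] + d[v][y]
            else:
                new[x][y] = d[y][u] + d[v][x]
    return new


if __name__ == "__main__":
    line = path_metric(4, [(0, 1, 1), (1, 2, 1), (2, 3, 1)])
    print(spr_metric(line, u=2, v=0))
```

On the path 0-1-2-3 with unit edges, pruning at $u = 2$ as seen from $v = 0$ detaches vertex 3 and reattaches it next to vertex 0, at the same distance it previously had from $u$.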



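Remark 6.1. It is clear that $\kappa((T, d_T, \nu_T), \cdot)$ is a Borel measure on $\mathbf{T}^{\mathrm{wt}}$
for each $(T, d_T, \nu_T) \in \mathbf{T}^{\mathrm{wt}}$. In order to show that $\kappa(\cdot, \mathbb{B})$ is a Borel function
on $\mathbf{T}^{\mathrm{wt}}$ for each $\mathbb{B} \in \mathcal{B}(\mathbf{T}^{\mathrm{wt}})$, so that $\kappa$ is indeed a kernel, it suffices to
observe for each bounded continuous function $F : \mathbf{T}^{\mathrm{wt}} \to \mathbb{R}$ that

$$\int F(\Theta(T,u,v))\, \mu^T(du)\, \nu_T(dv) = \lim_{\varepsilon \downarrow 0} \int F(\Theta(T,u,v))\, \mu^{R_\varepsilon(T)}(du)\, \nu_T(dv)$$

and that

$$(T, d_T, \nu_T) \mapsto \int F(\Theta(T,u,v))\, \mu^{R_\varepsilon(T)}(du)\, \nu_T(dv)$$

is continuous for all $\varepsilon > 0$ (the latter follows from an argument similar to that
in Lemma 7.3 of [20], where it is shown that the map $(T, d_T, \nu_T) \mapsto \mu^T(R_\varepsilon(T))$ is
continuous). We have only sketched the argument that $\kappa$ is a kernel, because
$\kappa$ is just a device for defining the measure $J$ on $\mathbf{T}^{\mathrm{wt}} \times \mathbf{T}^{\mathrm{wt}}$ in the next
paragraph. It is actually the measure $J$ that we use to define our Dirichlet
form, and the measure $J$ can be constructed directly as the push-forward of
a measure on $U^1 \times U^1$; see the proof of Lemma 6.2.

We show in part (i) of Lemma 6.2 below that the kernel $\kappa$ is reversible
with respect to the probability measure $\mathbf{P}$. More precisely, we show that if
we define a measure $J$ on $\mathbf{T}^{\mathrm{wt}} \times \mathbf{T}^{\mathrm{wt}}$ by

$$J(\mathbb{A} \times \mathbb{B}) := \int_{\mathbb{A}} \mathbf{P}(dT)\, \kappa(T, \mathbb{B}) \tag{6.3}$$

for $\mathbb{A}, \mathbb{B} \in \mathcal{B}(\mathbf{T}^{\mathrm{wt}})$, then $J$ is symmetric.

Lemma 6.2. (i) The measure $J$ is symmetric.

(ii) For each compact subset $\mathbb{K} \subset \mathbf{T}^{\mathrm{wt}}$ and open subset $\mathbb{U}$ such that
$\mathbb{K} \subset \mathbb{U} \subseteq \mathbf{T}^{\mathrm{wt}}$,

$$J(\mathbb{K}, \mathbf{T}^{\mathrm{wt}} \setminus \mathbb{U}) < \infty.$$

(iii) The function $\Delta_{\mathrm{GH}^{\mathrm{wt}}}$ is square-integrable with respect to $J$, that is,

$$\int_{\mathbf{T}^{\mathrm{wt}} \times \mathbf{T}^{\mathrm{wt}}} J(dT, dS)\, \Delta^2_{\mathrm{GH}^{\mathrm{wt}}}(T,S) < \infty.$$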

Proof. (i) Given $e', e'' \in U^1$, $0 \le u \le 1$ and $0 < \rho \le 1$, define
$e^\circ(\cdot; e', e'', u, \rho) \in U^1$ by

$$e^\circ(t; e', e'', u, \rho) := \begin{cases}
S_{1-\rho} e''(t), & 0 \le t \le (1-\rho)u, \\
S_{1-\rho} e''((1-\rho)u) + S_\rho e'(t - (1-\rho)u), & (1-\rho)u \le t \le (1-\rho)u + \rho, \\
S_{1-\rho} e''(t - \rho), & (1-\rho)u + \rho \le t \le 1.
\end{cases} \tag{6.4}$$

That is, $e^\circ(\cdot; e', e'', u, \rho)$ is the excursion that arises from Brownian rescaling
$e'$ and $e''$ to have lengths $\rho$ and $1 - \rho$, respectively, and then inserting the
rescaled version of $e'$ into the rescaled version of $e''$ at a position that is a
fraction $u$ of the total length of the rescaled version of $e''$.

Define a measure $\mathbf{J}$ on $U^1 \times U^1$ by

$$\begin{aligned}
&\int_{U^1 \times U^1} \mathbf{J}(de^*, de^{**})\, K(e^*, e^{**}) \\
&\quad := \int_{[0,1]^2} du \otimes dv\, \frac{1}{2\sqrt{2\pi}} \int_0^1 \frac{d\rho}{\sqrt{(1-\rho)\rho^3}} \int \mathbb{P}(de') \otimes \mathbb{P}(de'') \\
&\qquad \times K\big(e^\circ(\cdot; e', e'', u, \rho),\, e^\circ(\cdot; e', e'', v, \rho)\big).
\end{aligned} \tag{6.5}$$

Clearly, the measure $\mathbf{J}$ is symmetric. It follows from the discussion at the
beginning of the proof of part (i) of Lemma 5.4 and Corollary 5.2 that the
measure $J$ is the push-forward of the symmetric measure $2\mathbf{J}$ by the map

$$U^1 \times U^1 \ni (e^*, e^{**}) \mapsto \big( (T_{2e^*}, d_{T_{2e^*}}, \nu_{T_{2e^*}}),\, (T_{2e^{**}}, d_{T_{2e^{**}}}, \nu_{T_{2e^{**}}}) \big) \in \mathbf{T}^{\mathrm{wt}} \times \mathbf{T}^{\mathrm{wt}},$$

and hence $J$ is also symmetric.
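The grafting operation (6.4) is simple to implement for paths given as functions. The sketch below is only an illustration under stated assumptions: paths are Python functions on $[0,1]$, the `tent` path is a hypothetical stand-in for an excursion, and the case $\rho = 1$ is excluded to avoid rescaling to length zero.

```python
import math


def rescale(e, c):
    """S_c e : t -> sqrt(c) * e(t / c), zero outside [0, c]."""
    def scaled(t):
        return math.sqrt(c) * e(t / c) if 0.0 <= t <= c else 0.0
    return scaled


def insert_excursion(e1, e2, u, rho):
    """Path from (6.4): graft S_rho e1 into S_{1 - rho} e2 at fraction u of its length."""
    if not (0.0 < rho < 1.0 and 0.0 <= u <= 1.0):
        raise ValueError("this sketch needs 0 < rho < 1 and 0 <= u <= 1")
    inner = rescale(e1, rho)
    outer = rescale(e2, 1.0 - rho)
    cut = (1.0 - rho) * u

    def glued(t):
        if t <= cut:
            return outer(t)
        if t <= cut + rho:
            return outer(cut) + inner(t - cut)
        return outer(t - rho)
    return glued


def tent(t):
    """Hypothetical unit-length path used only for illustration."""
    return min(t, 1.0 - t) if 0.0 <= t <= 1.0 else 0.0


if __name__ == "__main__":
    first = insert_excursion(tent, tent, u=0.5, rho=0.25)
    second = insert_excursion(tent, tent, u=0.2, rho=0.25)
    print(first(0.5), second(0.5))
```

The pair of paths produced from the same $(e', e'', \rho)$ with two different insertion fractions $u$ and $v$ is exactly the pair appearing in (6.5); exchanging $u$ and $v$ exchanges the two paths, which is the source of the symmetry.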

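(ii) The result is trivial if $\mathbb{K} = \emptyset$, so we assume that $\mathbb{K} \ne \emptyset$. Since $\mathbf{T}^{\mathrm{wt}} \setminus \mathbb{U}$
and $\mathbb{K}$ are disjoint closed sets and $\mathbb{K}$ is compact, we have that

$$c := \inf_{T \in \mathbb{K},\, S \in \mathbf{T}^{\mathrm{wt}} \setminus \mathbb{U}} \Delta_{\mathrm{GH}^{\mathrm{wt}}}(T,S) > 0. \tag{6.6}$$

Fix $T \in \mathbb{K}$. If $(u,v) \in T \times T$ is such that $\Delta_{\mathrm{GH}^{\mathrm{wt}}}(T, \Theta(T,u,v)) > c$, then
$\mathrm{diam}(T) > c$ [so that we can think of $R_c(T)$, recall (2.27), as a subset of $T$].
Moreover, we claim that either:

(a) $u \in R_c(T,v)$ [recall (2.26)], or

(b) $u \notin R_c(T,v)$ and $\nu_T(S^{T,u,v}) > c$ [recall (5.17)].

Suppose, to the contrary, that $u \notin R_c(T,v)$ and that $\nu_T(S^{T,u,v}) \le c$. Because
$u \notin R_c(T,v)$, the map $f : T \to \Theta(T,u,v)$ given by

$$f(w) := \begin{cases} u, & \text{if } w \in S^{T,u,v}, \\ w, & \text{otherwise}, \end{cases}$$

is a measurable $c$-isometry. There is an analogous measurable $c$-isometry
$g : \Theta(T,u,v) \to T$. Clearly,

$$d_P(f_* \nu_T, \nu_{\Theta(T,u,v)}) \le c$$

and

$$d_P(\nu_T, g_* \nu_{\Theta(T,u,v)}) \le c.$$

Hence, by definition, $\Delta_{\mathrm{GH}^{\mathrm{wt}}}(T, \Theta(T,u,v)) \le c$.

Thus we have

$$\begin{aligned}
J(\mathbb{K}, \mathbf{T}^{\mathrm{wt}} \setminus \mathbb{U})
&\le \int_{\mathbb{K}} \mathbf{P}(dT)\, \kappa(T, \{S : \Delta_{\mathrm{GH}^{\mathrm{wt}}}(T,S) > c\}) \\
&\le \int_{\mathbb{K}} \mathbf{P}(dT) \int_T \nu_T(dv)\, \mu^T(R_c(T,v)) \\
&\quad + \int_{\mathbb{K}} \mathbf{P}(dT) \int_T \nu_T(dv)\, \mu^T\{u \in T : \nu_T(S^{T,u,v}) > c\} \\
&< \infty,
\end{aligned} \tag{6.7}$$

where we have used Lemma 5.4.

(iii) Similar reasoning yields that

$$\begin{aligned}
&\int_{\mathbf{T}^{\mathrm{wt}} \times \mathbf{T}^{\mathrm{wt}}} J(dT, dS)\, \Delta^2_{\mathrm{GH}^{\mathrm{wt}}}(T,S) \\
&\quad = \int_{\mathbf{T}^{\mathrm{wt}}} \mathbf{P}(dT) \int_0^\infty dt\, 2t\, \kappa(T, \{S : \Delta_{\mathrm{GH}^{\mathrm{wt}}}(T,S) > t\}) \\
&\quad \le \int_{\mathbf{T}^{\mathrm{wt}}} \mathbf{P}(dT) \int_0^\infty dt\, 2t \int_T \nu_T(dv)\, \mu^T(R_t(T,v)) \\
&\qquad + \int_{\mathbf{T}^{\mathrm{wt}}} \mathbf{P}(dT) \int_0^\infty dt\, 2t \int_T \nu_T(dv)\, \mu^T\{u \in T : \nu_T(S^{T,u,v}) > t\} \\
&\quad \le \int_0^\infty dt\, 2t \int_{\mathbf{T}^{\mathrm{wt}}} \mathbf{P}(dT) \int_T \nu_T(dv)\, \mu^T(R_t(T,v)) \\
&\qquad + \int_{\mathbf{T}^{\mathrm{wt}}} \mathbf{P}(dT) \int_T \nu_T(dv) \int_T \mu^T(du)\, \nu_T^2(S^{T,u,v}) \\
&\quad < \infty,
\end{aligned} \tag{6.8}$$

where we have applied Lemma 5.4 once more. □
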
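7. Dirichlet forms. Consider the bilinear form

$$\mathcal{E}(f,g) := \int_{\mathbf{T}^{\mathrm{wt}} \times \mathbf{T}^{\mathrm{wt}}} J(dT, dS)\, (f(S) - f(T))(g(S) - g(T)), \tag{7.1}$$

for $f, g$ in the domain

$$\mathcal{D}^*(\mathcal{E}) := \{ f \in L^2(\mathbf{T}^{\mathrm{wt}}, \mathbf{P}) : f \text{ is measurable, and } \mathcal{E}(f,f) < \infty \} \tag{7.2}$$

[here as usual, $L^2(\mathbf{T}^{\mathrm{wt}}, \mathbf{P})$ is equipped with the inner product
$(f,g)_{\mathbf{P}} := \int \mathbf{P}(dx)\, f(x) g(x)$]. By the argument in Example 1.2.1 in [22] and Lemma 6.2,
$(\mathcal{E}, \mathcal{D}^*(\mathcal{E}))$ is well defined, symmetric and Markovian.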
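A finite-state analogue may help make the form (7.1) concrete. The sketch below is an illustration only: the state space is a finite set, the symmetric matrix `jump` plays the role of the jump measure $J$, and the function names are hypothetical. It shows the symmetry of the form and the Markovian property, namely that truncating a function to $[0,1]$ does not increase its energy.

```python
def energy(jump, f, g):
    """Finite analogue of (7.1): sum over pairs of J(x, y) (f(y) - f(x)) (g(y) - g(x))."""
    n = len(f)
    return sum(jump[x][y] * (f[y] - f[x]) * (g[y] - g[x])
               for x in range(n) for y in range(n))


def unit_contraction(f):
    """Truncate f to the interval [0, 1]."""
    return [min(1.0, max(0.0, value)) for value in f]


if __name__ == "__main__":
    jump = [[0.0, 1.0, 2.0],
            [1.0, 0.0, 3.0],
            [2.0, 3.0, 0.0]]
    f = [-1.0, 0.5, 2.0]
    print(energy(jump, f, f), energy(jump, unit_contraction(f), unit_contraction(f)))
```

The truncation can only shrink each increment $|f(y) - f(x)|$, which is why the energy cannot increase; constants have zero energy for the same reason that they belong to the domain in the continuum setting.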
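Lemma 7.1. The form $(\mathcal{E}, \mathcal{D}^*(\mathcal{E}))$ is closed. That is, if $(f_n)_{n \in \mathbb{N}}$ is a
sequence in $\mathcal{D}^*(\mathcal{E})$ such that

$$\lim_{m,n \to \infty} \big( \mathcal{E}(f_n - f_m, f_n - f_m) + (f_n - f_m, f_n - f_m)_{\mathbf{P}} \big) = 0,$$

then there exists $f \in \mathcal{D}^*(\mathcal{E})$ such that

$$\lim_{n \to \infty} \big( \mathcal{E}(f_n - f, f_n - f) + (f_n - f, f_n - f)_{\mathbf{P}} \big) = 0.$$

Proof. Let $(f_n)_{n \in \mathbb{N}}$ be a sequence such that
$\lim_{m,n \to \infty} \mathcal{E}(f_n - f_m, f_n - f_m) + (f_n - f_m, f_n - f_m)_{\mathbf{P}} = 0$
[i.e., $(f_n)_{n \in \mathbb{N}}$ is Cauchy with respect to $\mathcal{E}(\cdot,\cdot) + (\cdot,\cdot)_{\mathbf{P}}$].
There exists a subsequence $(n_k)_{k \in \mathbb{N}}$ and $f \in L^2(\mathbf{T}^{\mathrm{wt}}, \mathbf{P})$ such that
$\lim_{k \to \infty} f_{n_k} = f$, $\mathbf{P}$-a.s., and $\lim_{k \to \infty} (f_{n_k} - f, f_{n_k} - f)_{\mathbf{P}} = 0$. By Fatou's
lemma,

$$\int J(dT, dS)\, (f(S) - f(T))^2 \le \liminf_{k \to \infty} \mathcal{E}(f_{n_k}, f_{n_k}) < \infty, \tag{7.3}$$

and so $f \in \mathcal{D}^*(\mathcal{E})$. Similarly,

$$\begin{aligned}
\mathcal{E}(f_n - f, f_n - f)
&= \int J(dT, dS) \lim_{k \to \infty} \big( (f_n - f_{n_k})(S) - (f_n - f_{n_k})(T) \big)^2 \\
&\le \liminf_{k \to \infty} \mathcal{E}(f_n - f_{n_k}, f_n - f_{n_k}) \to 0
\end{aligned} \tag{7.4}$$

as $n \to \infty$. Thus $(f_n)_{n \in \mathbb{N}}$ has a subsequence that converges to $f$ with respect
to $\mathcal{E}(\cdot,\cdot) + (\cdot,\cdot)_{\mathbf{P}}$, but, by the Cauchy property, this implies that $(f_n)_{n \in \mathbb{N}}$ itself
converges to $f$. □

Let $\mathcal{L}$ denote the collection of functions $f : \mathbf{T}^{\mathrm{wt}} \to \mathbb{R}$ such that

$$\sup_{T \in \mathbf{T}^{\mathrm{wt}}} |f(T)| < \infty \tag{7.5}$$

and

$$\sup_{S,T \in \mathbf{T}^{\mathrm{wt}},\, S \ne T} \frac{|f(S) - f(T)|}{\Delta_{\mathrm{GH}^{\mathrm{wt}}}(S,T)} < \infty. \tag{7.6}$$

Note that $\mathcal{L}$ consists of continuous functions and contains the constants.
It follows from (2.16) that $\mathcal{L}$ is both a vector lattice and an algebra. By
Lemma 7.2 below, $\mathcal{L} \subseteq \mathcal{D}^*(\mathcal{E})$. Therefore, the closure of $(\mathcal{E}, \mathcal{L})$ is a Dirichlet
form that we will denote by $(\mathcal{E}, \mathcal{D}(\mathcal{E}))$.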

Lemma 7.2. Suppose that $\{f_n\}_{n \in \mathbb{N}}$ is a sequence of functions from $\mathbf{T}^{\mathrm{wt}}$
into $\mathbb{R}$ such that

$$\sup_{n \in \mathbb{N}} \sup_{T \in \mathbf{T}^{\mathrm{wt}}} |f_n(T)| < \infty,$$

$$\sup_{n \in \mathbb{N}} \sup_{S,T \in \mathbf{T}^{\mathrm{wt}},\, S \ne T} \frac{|f_n(S) - f_n(T)|}{\Delta_{\mathrm{GH}^{\mathrm{wt}}}(S,T)} < \infty$$

and

$$\lim_{n \to \infty} f_n = f, \qquad \mathbf{P}\text{-a.s.},$$

for some $f : \mathbf{T}^{\mathrm{wt}} \to \mathbb{R}$. Then $\{f_n\}_{n \in \mathbb{N}} \subset \mathcal{D}^*(\mathcal{E})$, $f \in \mathcal{D}^*(\mathcal{E})$, and

$$\lim_{n \to \infty} \big( \mathcal{E}(f_n - f, f_n - f) + (f_n - f, f_n - f)_{\mathbf{P}} \big) = 0.$$

Proof. By the definition of the measure $J$ [see (6.3)] and the symmetry
of $J$ [Lemma 6.2(i)], we have that $f_n(x) - f_n(y) \to f(x) - f(y)$ for $J$-almost
every pair $(x,y)$. The result then follows from part (iii) of Lemma 6.2 and
the dominated convergence theorem. □

Before showing that $(\mathcal{E}, \mathcal{D}(\mathcal{E}))$ is the Dirichlet form of a nice Markov
process, we remark that $\mathcal{L}$, and hence $\mathcal{D}(\mathcal{E})$, is quite a rich class of functions:
we show in the proof of Theorem 7.3 below that $\mathcal{L}$ separates points of $\mathbf{T}^{\mathrm{wt}}$
and hence if $\mathbb{K}$ is any compact subset of $\mathbf{T}^{\mathrm{wt}}$, then, by the Stone-Weierstrass
theorem, the set of restrictions of functions in $\mathcal{L}$ to $\mathbb{K}$ is uniformly dense in
the space of real-valued continuous functions on $\mathbb{K}$.

The following theorem states that there is a well-defined Markov process
with the dynamics we would expect for a limit of the subtree prune and
regraft chains.

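Theorem 7.3. There exists a recurrent $\mathbf{P}$-symmetric Hunt process
$X = (X_t, \mathbf{P}^T)$ on $\mathbf{T}^{\mathrm{wt}}$ whose Dirichlet form is $(\mathcal{E}, \mathcal{D}(\mathcal{E}))$.

Proof. We will check the conditions of Theorem 7.3.1 in [22] to establish
the existence of $X$.

Because $\mathbf{T}^{\mathrm{wt}}$ is complete and separable (recall Theorem 2.5) there is a
sequence $\mathbb{H}_1 \subseteq \mathbb{H}_2 \subseteq \cdots$ of compact subsets of $\mathbf{T}^{\mathrm{wt}}$ such that
$\mathbf{P}(\bigcup_{k \in \mathbb{N}} \mathbb{H}_k) = 1$. Given $\alpha, \beta > 0$, write $\mathcal{L}_{\alpha,\beta}$ for the subset of $\mathcal{L}$ consisting of functions $f$
such that

$$\sup_{T \in \mathbf{T}^{\mathrm{wt}}} |f(T)| \le \alpha \tag{7.7}$$

and

$$\sup_{S,T \in \mathbf{T}^{\mathrm{wt}},\, S \ne T} \frac{|f(S) - f(T)|}{\Delta_{\mathrm{GH}^{\mathrm{wt}}}(S,T)} \le \beta. \tag{7.8}$$

By the separability of the continuous real-valued functions on each $\mathbb{H}_k$ with
respect to the supremum norm, it follows that for each $k \in \mathbb{N}$ there is a
countable set $\mathbb{L}_{\alpha,\beta,k} \subseteq \mathcal{L}_{\alpha,\beta}$ such that for every $f \in \mathcal{L}_{\alpha,\beta}$

$$\inf_{g \in \mathbb{L}_{\alpha,\beta,k}}\; \sup_{T \in \mathbb{H}_k} |f(T) - g(T)| = 0. \tag{7.9}$$

Set $\mathbb{L}_{\alpha,\beta} := \bigcup_{k \in \mathbb{N}} \mathbb{L}_{\alpha,\beta,k}$. Then for any $f \in \mathcal{L}_{\alpha,\beta}$ there exists a sequence
$\{f_n\}_{n \in \mathbb{N}}$ in $\mathbb{L}_{\alpha,\beta}$ such that $\lim_{n \to \infty} f_n = f$ pointwise on $\bigcup_{k \in \mathbb{N}} \mathbb{H}_k$, and hence
$\mathbf{P}$-almost surely. By Lemma 7.2, the countable set $\bigcup_{m \in \mathbb{N}} \mathbb{L}_{m,m}$ is dense in
$\mathcal{L}$, and hence also dense in $\mathcal{D}(\mathcal{E})$, with respect to $\mathcal{E}(\cdot,\cdot) + (\cdot,\cdot)_{\mathbf{P}}$.

Now fix a countable dense subset $\mathbb{S} \subset \mathbf{T}^{\mathrm{wt}}$. Let $\mathbb{M}$ denote the countable
set of functions of the form

$$T \mapsto p + q\,(\Delta_{\mathrm{GH}^{\mathrm{wt}}}(S,T) \wedge r) \tag{7.10}$$

for some $S \in \mathbb{S}$ and $p, q, r \in \mathbb{Q}$. Note that $\mathbb{M} \subseteq \mathcal{L}$, that $\mathbb{M}$ separates the
points of $\mathbf{T}^{\mathrm{wt}}$ and, for any $T \in \mathbf{T}^{\mathrm{wt}}$, that there is certainly a function $f \in \mathbb{M}$
with $f(T) \ne 0$.

Consequently, if $\mathbb{C}$ is the algebra generated by the countable set
$\mathbb{M} \cup \bigcup_{m \in \mathbb{N}} \mathbb{L}_{m,m}$, then it is certainly the case that $\mathbb{C}$ is dense in $\mathcal{D}(\mathcal{E})$ with respect
to $\mathcal{E}(\cdot,\cdot) + (\cdot,\cdot)_{\mathbf{P}}$, that $\mathbb{C}$ separates the points of $\mathbf{T}^{\mathrm{wt}}$ and, for any $T \in \mathbf{T}^{\mathrm{wt}}$,
that there is a function $f \in \mathbb{C}$ with $f(T) \ne 0$.

All that remains in verifying the conditions of Theorem 7.3.1 in [22] is to
check the tightness condition that there exist compact subsets $\mathbb{K}_1 \subseteq \mathbb{K}_2 \subseteq \cdots$
of $\mathbf{T}^{\mathrm{wt}}$ such that $\lim_{n \to \infty} \mathrm{Cap}(\mathbf{T}^{\mathrm{wt}} \setminus \mathbb{K}_n) = 0$, where Cap is the capacity
associated with the Dirichlet form; see Remark 7.4 below for a definition.
This convergence, however, is the content of Lemma 7.7 below.

Finally, because constants belong to $\mathcal{D}(\mathcal{E})$, it follows from Theorem 1.6.3
in [22] that $X$ is recurrent. □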

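Remark 7.4. In the proof of Theorem 7.3 we used the capacity associated
with the Dirichlet form $(\mathcal{E}, \mathcal{D}(\mathcal{E}))$. We remind the reader that for an
open subset $\mathbb{U} \subseteq \mathbf{T}^{\mathrm{wt}}$,

$$\mathrm{Cap}(\mathbb{U}) := \inf\{ \mathcal{E}(f,f) + (f,f)_{\mathbf{P}} : f \in \mathcal{D}(\mathcal{E}),\; f(T) \ge 1,\; \mathbf{P}\text{-a.e. } T \in \mathbb{U} \},$$

and for a general subset $\mathbb{A} \subseteq \mathbf{T}^{\mathrm{wt}}$

$$\mathrm{Cap}(\mathbb{A}) := \inf\{ \mathrm{Cap}(\mathbb{U}) : \mathbb{A} \subseteq \mathbb{U} \text{ is open} \}.$$

We refer the reader to Section 2.1 of [22] for details and a proof that Cap
is a Choquet capacity.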

The following results were needed in the proof of Theorem 7.3. 

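Lemma 7.5. For $\varepsilon, a, \delta > 0$, put $\mathbb{V}_{\varepsilon,a} := \{T \in \mathbf{T} : \mu^T(R_\varepsilon(T)) > a\}$ and,
as usual, $\mathbb{V}^\delta_{\varepsilon,a} := \{T \in \mathbf{T} : d_{\mathrm{GH}}(T, \mathbb{V}_{\varepsilon,a}) < \delta\}$. Then, for fixed $\varepsilon > 3\delta$,

$$\bigcap_{a > 0} \mathbb{V}^\delta_{\varepsilon,a} = \emptyset.$$

Proof. Fix $S \in \mathbf{T}$. If $S \in \mathbb{V}^\delta_{\varepsilon,a}$, then there exists $T \in \mathbb{V}_{\varepsilon,a}$ such that
$d_{\mathrm{GH}}(S,T) < \delta$. Observe that $R_\varepsilon(T)$ is not the trivial tree consisting of a
single point because it has total length greater than $a$. Write $\{y_1, \ldots, y_n\}$
for the leaves of $R_\varepsilon(T)$. For all $i = 1, \ldots, n$, the connected component of
$T \setminus R_\varepsilon(T)^\circ$ that contains $y_i$ contains a point $z_i$ such that $d_T(y_i, z_i) = \varepsilon$.

Let $\mathfrak{R}$ be a correspondence between $S$ and $T$ with $\mathrm{dis}(\mathfrak{R}) < 2\delta$. Pick
$x_1, \ldots, x_n \in S$ such that $(x_i, z_i) \in \mathfrak{R}$, and hence $|d_S(x_i, x_j) - d_T(z_i, z_j)| < 2\delta$
for all $i, j$.

The distance in $R_\varepsilon(T)$ from the point $y_k$ to the arc $[y_i, y_j]$ is

$$\tfrac{1}{2}\big( d_T(y_k, y_i) + d_T(y_k, y_j) - d_T(y_i, y_j) \big). \tag{7.11}$$

Thus the distance from $y_k$, $3 \le k \le n$, to the subtree spanned by $y_1, \ldots, y_{k-1}$
is

$$\bigwedge_{1 \le i < j \le k-1} \tfrac{1}{2}\big( d_T(y_k, y_i) + d_T(y_k, y_j) - d_T(y_i, y_j) \big), \tag{7.12}$$

and hence

$$\mu^T(R_\varepsilon(T)) = d_T(y_1, y_2) + \sum_{k=3}^n\; \bigwedge_{1 \le i < j \le k-1} \tfrac{1}{2}\big( d_T(y_k, y_i) + d_T(y_k, y_j) - d_T(y_i, y_j) \big). \tag{7.13}$$

Now the distance in $S$ from the point $x_k$ to the arc $[x_i, x_j]$ is

$$\begin{aligned}
&\tfrac{1}{2}\big( d_S(x_k, x_i) + d_S(x_k, x_j) - d_S(x_i, x_j) \big) \\
&\quad \ge \tfrac{1}{2}\big( d_T(z_k, z_i) + d_T(z_k, z_j) - d_T(z_i, z_j) - 3 \times 2\delta \big) \\
&\quad = \tfrac{1}{2}\big( d_T(y_k, y_i) + 2\varepsilon + d_T(y_k, y_j) + 2\varepsilon - d_T(y_i, y_j) - 2\varepsilon - 6\delta \big) \\
&\quad > 0
\end{aligned} \tag{7.14}$$

by the assumption that $\varepsilon > 3\delta$. In particular, $x_1, \ldots, x_n$ are leaves of the
subtree spanned by $\{x_1, \ldots, x_n\}$, and $R_\gamma(S)$ has at least $n$ leaves when
$0 < \gamma < 2\varepsilon - 6\delta$. Fix such a $\gamma$.

Now

$$\begin{aligned}
\mu^S(R_\gamma(S))
&\ge d_S(x_1, x_2) - 2\gamma + \sum_{k=3}^n\; \bigwedge_{1 \le i < j \le k-1} \Big[ \tfrac{1}{2}\big( d_S(x_k, x_i) + d_S(x_k, x_j) - d_S(x_i, x_j) \big) - \gamma \Big] \\
&\ge \mu^T(R_\varepsilon(T)) + (2\varepsilon - 2\delta - 2\gamma) + (n-2)(\varepsilon - 3\delta - \gamma) \\
&\ge a + (2\varepsilon - 2\delta - 2\gamma) + (n-2)(\varepsilon - 3\delta - \gamma).
\end{aligned} \tag{7.15}$$

Because $\mu^S(R_\gamma(S))$ is finite, it is apparent that $S$ cannot belong to $\mathbb{V}^\delta_{\varepsilon,a}$
when $a$ is sufficiently large. □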
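The length formula (7.13) reconstructs the total length of a finite tree from the pairwise distances of its leaves, adding the leaves one at a time. A minimal sketch, assuming the input is the full matrix of leaf-to-leaf distances of a tree (the function name is hypothetical):

```python
def spanned_length(d):
    """Total length of the tree spanned by leaves 0..n-1, following (7.13)."""
    n = len(d)
    if n < 2:
        return 0.0
    total = float(d[0][1])
    for k in range(2, n):
        # distance from leaf k to the subtree spanned by the earlier leaves, as in (7.12)
        total += min(0.5 * (d[k][i] + d[k][j] - d[i][j])
                     for i in range(k) for j in range(i + 1, k))
    return total


if __name__ == "__main__":
    # star with three edges of lengths 1, 2 and 3
    star = [[0, 3, 4],
            [3, 0, 5],
            [4, 5, 0]]
    print(spanned_length(star))
```

Each summand is the length of the new branch created when the next leaf is attached, which is why a perturbation of every pairwise distance by at most $2\delta$ changes each summand by at most $3\delta$ in (7.14).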

Lemma 7.6. For e,a > 0, let V £ja be as in Lemma 7.5. Set U £ja := 
{(T, z/) G T wt : T G V e>a }. TYien, /or /iz'ed e, 

(7.16) lim Cap(U e a ) = 0. 

Proof. Observe that (T,cIt,vt) •— > h Re ^ t \T) is continuous (this is es- 
sentially Lemma 7.3 of [20]), and so U £ia is open. 

Choose 5 > such that e > 35. Suppressing the dependence on e and 5, 
define u a : T wt [0, 1] by 

(7.17) u a ((T,v)) := 5-\5 - d GU (T,V £:a )) + . 

Note that u a takes the value 1 on the open set U e>a , and so Cap(U ?ia ) < 
£(u a ,u a ) + (u a ,u a )p. Also observe that 

\u a ((T',v'))-u a ((T",v"))\<5- 1 d G n(T',T") 

(7.18) 

<r 1 A Girrt ((T' J i/),(r' / ,i/ / )). 

It therefore suffices by part (iii) of Lemma 6.2 and the dominated conver- 
gence theorem to show for each pair ((T', z/), (T" , v")) G T wt x T wt that 
u a ((T' , u')) — u a ((T" is for a sufficiently large and for each T G 
T wt that u a ((T, u)) is for a sufficiently large. However, u a ((T' — 
u a ((T", v")) + implies that either V or T" belongs to V| jB , while u a ((T, v)) ± 
implies that T belongs to V £a . The result then follows from Lemma 7.5. 
□ 

Lemma 7.7. There is a sequence of compact sets Ki C K2 C ••• such 
that lim^oo Cap(T wt \ K n ) = 0. 

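Proof. By Lemma 7.6, for $n = 1,2,\ldots$ we can choose $a_n$ so that $\mathrm{Cap}(U_{2^{-n},a_n}) \le 2^{-n}$. Set

\[
F_n := \mathbf{T}^{\mathrm{wt}} \setminus U_{2^{-n},a_n} = \{(T,\nu) \in \mathbf{T}^{\mathrm{wt}} : \mu_T(R_{2^{-n}}(T)) \le a_n\} \tag{7.19}
\]

and

\[
K_n := \bigcap_{m\ge n} F_m. \tag{7.20}
\]

By Proposition 2.4 and Lemma 2.6, each set $K_n$ is compact. By construction,

\[
\begin{aligned}
\mathrm{Cap}(\mathbf{T}^{\mathrm{wt}} \setminus K_n) &= \mathrm{Cap}\Bigl(\bigcup_{m\ge n} U_{2^{-m},a_m}\Bigr)\\
&\le \sum_{m\ge n} \mathrm{Cap}(U_{2^{-m},a_m}) \le \sum_{m\ge n} 2^{-m} = 2^{-(n-1)}.
\end{aligned}
\tag{7.21}
\]
□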
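The last equality in (7.21) is the tail of a geometric series. As a quick sanity check, the following minimal sketch compares a truncated tail sum with the closed form $2^{-(n-1)}$ in exact rational arithmetic. The function names `tail_sum` and `closed_form` are illustrative and do not come from the paper.

```python
from fractions import Fraction


def tail_sum(n: int, terms: int = 60) -> Fraction:
    """Truncated tail sum_{m=n}^{n+terms-1} 2^{-m}, computed exactly."""
    return sum(Fraction(1, 2 ** m) for m in range(n, n + terms))


def closed_form(n: int) -> Fraction:
    """Closed form 2^{-(n-1)} for the full tail sum_{m >= n} 2^{-m}."""
    return Fraction(1, 2 ** (n - 1))


for n in (1, 2, 5, 10):
    partial = tail_sum(n)
    # The truncated sum falls short of the closed form by exactly
    # 2^{-(n + terms - 1)}, which vanishes as more terms are included.
    gap = closed_form(n) - partial
    print(n, float(partial), float(closed_form(n)), float(gap))
```

Since the bound $2^{-(n-1)}$ tends to 0, the capacities of the complements of the compact sets $K_n$ tend to 0 as well.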

8. The trivial tree is essentially polar. From our informal picture of the
process $X$ evolving via rearrangements of the initial tree that preserve the
total branch length, one might expect that if $X$ does not start at the trivial
tree $T_0$ consisting of a single point, then $X$ will never hit $T_0$. However, an
SPR move can decrease the diameter of a tree, so it is conceivable that, in
passing to the limit, there is some probability that an infinite sequence of
SPR moves will conspire to collapse the evolving tree down to a single point.
Of course, it is hard to imagine from the approximating dynamics how $X$
could recover from such a catastrophe, which it would have to since it is
reversible with respect to the continuum random tree distribution.

In this section we will use potential theory for Dirichlet forms to show
that $X$ does not hit $T_0$ from $\mathbf{P}$-almost all starting points; that is, that the
set $\{T_0\}$ is essentially polar.

Let $\bar d$ be the map which sends a weighted $\mathbb{R}$-tree $(T,d,\nu)$ to the $\nu$-averaged
distance between pairs of points in $T$. That is,

\[
\bar d((T,d,\nu)) := \int_T\int_T \nu(dx)\,\nu(dy)\,d(x,y), \qquad (T,d,\nu) \in \mathbf{T}^{\mathrm{wt}}. \tag{8.1}
\]

In order to show that $T_0$ is essentially polar, it will suffice to show that the
set

\[
\{(T,d,\nu) \in \mathbf{T}^{\mathrm{wt}} : \bar d((T,d,\nu)) = 0\} \tag{8.2}
\]

is essentially polar.
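To illustrate the averaged distance in (8.1), here is a minimal sketch for a finitely supported measure, where the double integral reduces to a double sum. The star-shaped example, the function name `averaged_distance` and the optional `cap` argument (which mimics truncating $d(x,y)$ at a level $n$) are illustrative assumptions, not objects taken from the paper.

```python
from itertools import product


def averaged_distance(weights, dist, cap=None):
    """Return sum_{i,j} w_i * w_j * d(i, j), optionally truncating d at `cap`."""
    total = 0.0
    for i, j in product(range(len(weights)), repeat=2):
        d = dist[i][j] if cap is None else min(dist[i][j], cap)
        total += weights[i] * weights[j] * d
    return total


# Star tree: a centre (index 0) joined to three leaves by edges of length 1.
star = [
    [0, 1, 1, 1],
    [1, 0, 2, 2],
    [1, 2, 0, 2],
    [1, 2, 2, 0],
]
uniform_on_leaves = [0.0, 1 / 3, 1 / 3, 1 / 3]
point_mass_at_centre = [1.0, 0.0, 0.0, 0.0]

print(averaged_distance(uniform_on_leaves, star))          # about 4/3
print(averaged_distance(uniform_on_leaves, star, cap=1))   # about 2/3
print(averaged_distance(point_mass_at_centre, star))       # 0.0
```

In this toy example the averaged distance vanishes when the measure is a point mass, which is the degenerate situation that the set in (8.2) captures.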

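Lemma 8.1. The function $\bar d$ belongs to the domain $\mathcal{D}(\mathcal{E})$.

Proof. If we let $\bar d_n((T,d,\nu)) := \int_T\int_T \nu(dx)\,\nu(dy)\,[d(x,y) \wedge n]$, for $n \in \mathbb{N}$,
then $\bar d_n \uparrow \bar d$, $\mathbf{P}$-a.s. By the triangle inequality,

\[
(\bar d,\bar d)_{\mathbf{P}} \le \int \mathbf{P}(dT)\,(\mathrm{diam}(T))^2 \le \int \mathbb{P}(de)\,\Bigl(4\sup_{t\in[0,1]} e(t)\Bigr)^2 < \infty, \tag{8.3}
\]

and hence $\bar d_n \to \bar d$ as $n \to \infty$ in $L^2(\mathbf{T}^{\mathrm{wt}}, \mathbf{P})$.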

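Notice, moreover, that for $(T,d,\nu) \in \mathbf{T}^{\mathrm{wt}}$ and $u,v \in T$,

\[
\begin{aligned}
\bigl(\bar d((T,d,\nu)) - \bar d(\Theta((T,d,\nu),u,v))\bigr)^2
&= \Bigl(2\int_{S^{T,u,v}}\int_{T\setminus S^{T,u,v}} \nu(dx)\,\nu(dy)\,\bigl(d(y,u)-d(y,v)\bigr)\Bigr)^2\\
&\le 2\,\nu_T(S^{T,u,v})\,\nu_T(T\setminus S^{T,u,v})\,d^2(u,v),
\end{aligned}
\tag{8.4}
\]

where the inequality uses $|d(y,u)-d(y,v)| \le d(u,v)$ and $\nu_T(S^{T,u,v})\,\nu_T(T\setminus S^{T,u,v}) \le \tfrac14$.
Hence, applying Corollary 5.2 and the invariance of the standard Brownian
excursion under random rerooting (see Section 2.7 of [3]),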



'pwt y/ rpwt 

= 2 

x 



J(dT,dS)(d(T) -d(S)) 2 
P(dT) 

iy T (dv)p T (du)iy T (S T ' u ' v )u T (T \ S T ' U ' V ) 4(u, v) 



TxT 



1.5) < 2 J P(de)2 J 



ds <S> da 



r e s(e,s,a) - s(e,s,a) 
dp 



2vr7o ^{l-p)p A 



e")p{l -/9)(sup5i_ p e' 



H\2 



2vr Jo 



1 77T^ S ^ " ^ / P ( & ) f SUP 2 < 



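Consequently, by dominated convergence, $\mathcal{E}(\bar d - \bar d_n, \bar d - \bar d_n) \to 0$ as $n \to \infty$.
It is therefore enough to verify that $\bar d_n \in \mathcal{L}$ for all $n \in \mathbb{N}$. Obviously,

\[
\sup_{T \in \mathbf{T}^{\mathrm{wt}}} \bar d_n(T) \le n, \tag{8.6}
\]

and so the boundedness condition (7.5) holds. To show that the "Lipschitz"
property (7.6) holds, fix $\varepsilon > 0$, and let $(T,\nu_T),(S,\nu_S) \in \mathbf{T}^{\mathrm{wt}}$ be such that
$\Delta_{\mathrm{GH}^{\mathrm{wt}}}((T,\nu_T),(S,\nu_S)) < \varepsilon$. Then there exist $f \in F^{\varepsilon}_{T,S}$ and $g \in F^{\varepsilon}_{S,T}$ such
that $d_{\mathrm{P}}(\nu_T, g_*\nu_S) < \varepsilon$ and $d_{\mathrm{P}}(f_*\nu_T, \nu_S) < \varepsilon$ [recall $F^{\varepsilon}_{T,S}$ from (2.10)]. Hence

\[
\begin{aligned}
&|\bar d_n((T,\nu_T)) - \bar d_n((S,\nu_S))|\\
&\quad\le \Bigl|\int_T\int_T \nu_T(dx)\,\nu_T(dy)\,(d_T(x,y)\wedge n) - \int_{g(S)}\int_{g(S)} g_*\nu_S(dx)\,g_*\nu_S(dy)\,(d_T(x,y)\wedge n)\Bigr|\\
&\qquad + \Bigl|\int_{g(S)}\int_{g(S)} g_*\nu_S(dx)\,g_*\nu_S(dy)\,(d_T(x,y)\wedge n) - \int_S\int_S \nu_S(dx')\,\nu_S(dy')\,(d_S(x',y')\wedge n)\Bigr|.
\end{aligned}
\tag{8.7}
\]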

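For the first term on the right-hand side of (8.7) we get

\[
\begin{aligned}
&\Bigl|\int_T\int_T \nu_T(dx)\,\nu_T(dy)\,(d_T(x,y)\wedge n) - \int_{g(S)}\int_{g(S)} g_*\nu_S(dx)\,g_*\nu_S(dy)\,(d_T(x,y)\wedge n)\Bigr|\\
&\quad\le \Bigl|\int_T\int_T \nu_T(dx)\,\nu_T(dy)\,(d_T(x,y)\wedge n) - \int_T\int_{g(S)} \nu_T(dx)\,g_*\nu_S(dy)\,(d_T(x,y)\wedge n)\Bigr|\\
&\qquad + \Bigl|\int_{g(S)}\int_T g_*\nu_S(dx)\,\nu_T(dy)\,(d_T(x,y)\wedge n) - \int_{g(S)}\int_{g(S)} g_*\nu_S(dx)\,g_*\nu_S(dy)\,(d_T(x,y)\wedge n)\Bigr|.
\end{aligned}
\tag{8.8}
\]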

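By assumption and Theorem 3.1.2 in [19], we can find a probability measure
$\nu$ on $T \times T$ with marginals $\nu_T$ and $g_*\nu_S$ such that

\[
\nu\{(x,y) : d_T(x,y) \ge \varepsilon\} \le \varepsilon. \tag{8.9}
\]

Hence, for all $x \in T$,

\[
\begin{aligned}
&\Bigl|\int_T \nu_T(dy)\,(d_T(x,y)\wedge n) - \int_{g(S)} g_*\nu_S(dy)\,(d_T(x,y)\wedge n)\Bigr|\\
&\quad\le \int_{T\times g(S)} \nu(d(y,y'))\,\bigl|(d_T(x,y)\wedge n) - (d_T(x,y')\wedge n)\bigr|\\
&\quad\le \int_{T\times g(S)} \nu(d(y,y'))\,(d_T(y,y')\wedge n)\\
&\quad\le \bigl(1 + (\mathrm{diam}(T)\wedge n)\bigr)\cdot\varepsilon.
\end{aligned}
\tag{8.10}
\]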



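For the second term in (8.7) we use the fact that $g$ is an $\varepsilon$-isometry, that
is, $|(d_S(x',y')\wedge n) - (d_T(g(x'),g(y'))\wedge n)| < \varepsilon$ for all $x',y' \in S$. A change of
variables then yields that

\[
\begin{aligned}
&\Bigl|\int_{g(S)}\int_{g(S)} g_*\nu_S(dx)\,g_*\nu_S(dy)\,(d_T(x,y)\wedge n) - \int_S\int_S \nu_S(dx')\,\nu_S(dy')\,(d_S(x',y')\wedge n)\Bigr|\\
&\quad\le \Bigl|\int_{g(S)}\int_{g(S)} g_*\nu_S(dx)\,g_*\nu_S(dy)\,(d_T(x,y)\wedge n) - \int_S\int_S \nu_S(dx')\,\nu_S(dy')\,(d_T(g(x'),g(y'))\wedge n)\Bigr| + \varepsilon\\
&\quad= \varepsilon.
\end{aligned}
\tag{8.11}
\]

Combining (8.7) through (8.11) yields finally that

\[
\sup_{(T,\nu_T)\ne(S,\nu_S)\in\mathbf{T}^{\mathrm{wt}}} \frac{|\bar d_n((T,\nu_T)) - \bar d_n((S,\nu_S))|}{\Delta_{\mathrm{GH}^{\mathrm{wt}}}((T,\nu_T),(S,\nu_S))} \le 3 + 2n. \tag{8.12}
\]
□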



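Proposition 8.2. The set $\{T \in \mathbf{T}^{\mathrm{wt}} : \bar d(T) = 0\}$ is essentially polar. In
particular, the set $\{T_0\}$ consisting of the trivial tree is essentially polar.

Proof. We need to show that $\mathrm{Cap}(\{T \in \mathbf{T}^{\mathrm{wt}} : \bar d(T) = 0\}) = 0$ (see Theorem 4.2.1 of [22]).
For $\varepsilon > 0$ set

\[
W_\varepsilon := \{T \in \mathbf{T}^{\mathrm{wt}} : \bar d(T) < \varepsilon\}. \tag{8.13}
\]

By the argument in the proof of Lemma 8.1, the function $\bar d$ is continuous,
and so $W_\varepsilon$ is open. It suffices to show that $\mathrm{Cap}(W_\varepsilon) \downarrow 0$ as $\varepsilon \downarrow 0$.
Put

\[
u_\varepsilon(T) := \Bigl(2 - \frac{\bar d(T)}{\varepsilon}\Bigr)_+, \qquad T \in \mathbf{T}^{\mathrm{wt}}. \tag{8.14}
\]

Then $u_\varepsilon \in \mathcal{D}(\mathcal{E})$ by Lemma 8.1 and the fact that the domain of a Dirichlet
form is closed under composition with Lipschitz functions. Because $u_\varepsilon(T) \ge 1$
for $T \in W_\varepsilon$, it thus further suffices to show

\[
\lim_{\varepsilon\downarrow 0}\bigl(\mathcal{E}(u_\varepsilon,u_\varepsilon) + (u_\varepsilon,u_\varepsilon)_{\mathbf{P}}\bigr) = 0. \tag{8.15}
\]

By elementary properties of the standard Brownian excursion,

\[
(u_\varepsilon,u_\varepsilon)_{\mathbf{P}} \le 4\,\mathbf{P}\{T : \bar d(T) < 2\varepsilon\} \to 0 \tag{8.16}
\]

as $\varepsilon \downarrow 0$. Estimating $\mathcal{E}(u_\varepsilon,u_\varepsilon)$ will be somewhat more involved.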
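The cutoff in (8.14) can be explored numerically. The sketch below assumes the reading $u_\varepsilon(T) = (2 - \bar d(T)/\varepsilon)_+$, which is consistent with the bounds stated around (8.14) and (8.16), and treats the averaged distance as a plain number `dbar`. The function name `u_eps` is illustrative.

```python
def u_eps(dbar: float, eps: float) -> float:
    """Cutoff (2 - dbar/eps)_+ applied to an averaged distance dbar."""
    return max(2.0 - dbar / eps, 0.0)


eps = 0.125  # binary-exact values keep the floating-point comparisons clean
for dbar in (0.0, 0.0625, 0.125, 0.1875, 0.25, 0.5):
    u = u_eps(dbar, eps)
    indicator = 1.0 if dbar < 2 * eps else 0.0
    # u is at least 1 wherever dbar < eps, i.e. on the set W_eps ...
    if dbar < eps:
        assert u >= 1.0
    # ... and u^2 is dominated by 4 times the indicator of {dbar < 2 * eps}.
    assert u * u <= 4.0 * indicator
    print(dbar, u)
```

The second assertion is the pointwise inequality behind the bound $(u_\varepsilon,u_\varepsilon)_{\mathbf{P}} \le 4\,\mathbf{P}\{\bar d < 2\varepsilon\}$.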

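Let $\hat E$ and $\check E$ be two independent standard Brownian excursions, and let
$U$ and $V$ be two independent random variables that are independent of $\hat E$
and $\check E$ and uniformly distributed on $[0,1]$. With a slight abuse of notation,
we will write $\mathbb{P}$ for the probability measure on the probability space where
$\hat E$, $\check E$, $U$ and $V$ are defined.

Set

\[
\begin{aligned}
\hat D &:= 4\int_{0\le s<t\le 1} ds\otimes dt\,\Bigl[\hat E_s + \hat E_t - 2\inf_{s\le w\le t}\hat E_w\Bigr],\\
\hat H &:= 2\int_{[0,1]} dt\,\hat E_t,\\
\check D &:= 4\int_{0\le s<t\le 1} ds\otimes dt\,\Bigl[\check E_s + \check E_t - 2\inf_{s\le w\le t}\check E_w\Bigr],\\
\check H_U &:= 2\int_{[0,1]} dt\,\Bigl[\check E_t + \check E_U - 2\inf_{U\wedge t\le w\le U\vee t}\check E_w\Bigr],\\
\check H_V &:= 2\int_{[0,1]} dt\,\Bigl[\check E_t + \check E_V - 2\inf_{V\wedge t\le w\le V\vee t}\check E_w\Bigr].
\end{aligned}
\tag{8.17}
\]

For $0 \le \rho \le 1$ set

\[
D_U(\rho) := (1-\rho)^2\sqrt{1-\rho}\,\check D + \rho^2\sqrt{\rho}\,\hat D + 2(1-\rho)\rho\sqrt{\rho}\,\hat H + 2(1-\rho)\rho\sqrt{1-\rho}\,\check H_U \tag{8.18}
\]

and

\[
D_V(\rho) := (1-\rho)^2\sqrt{1-\rho}\,\check D + \rho^2\sqrt{\rho}\,\hat D + 2(1-\rho)\rho\sqrt{\rho}\,\hat H + 2(1-\rho)\rho\sqrt{1-\rho}\,\check H_V. \tag{8.19}
\]

Then

\[
\mathcal{E}(u_\varepsilon,u_\varepsilon) = \frac{1}{2\sqrt{2\pi}}\int_0^1 \frac{d\rho}{\sqrt{(1-\rho)\rho^3}}\;\mathbb{P}\Bigl[\Bigl\{\Bigl(2 - \frac{D_U(\rho)}{\varepsilon}\Bigr)_+ - \Bigl(2 - \frac{D_V(\rho)}{\varepsilon}\Bigr)_+\Bigr\}^2\Bigr]. \tag{8.20}
\]

Fix $0 < a < \tfrac12$ and write $\bar a = 1 - a$ for convenience. We can write the right-hand
side of (8.20) as the sum of three terms $I(\varepsilon,a)$, $II(\varepsilon,a)$ and $III(\varepsilon,a)$
that arise from integrating $\rho$ over the respective ranges

\[
\{\rho : D_U(\rho)\vee D_V(\rho) \le 2\varepsilon,\ 0 \le \rho \le a\}, \tag{8.21}
\]

\[
\{\rho : D_U(\rho)\wedge D_V(\rho) \le 2\varepsilon < D_U(\rho)\vee D_V(\rho),\ 0 \le \rho \le a\}, \tag{8.22}
\]

and

\[
\{\rho : a < \rho \le 1\}. \tag{8.23}
\]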
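The quantities in (8.18) and (8.19) share three terms and differ only in the last one. The sketch below evaluates the four-term expression for plain numerical inputs, under the assumed reading that the excursion paired with the factor $(1-\rho)$ is the one from which $H_U$ and $H_V$ are built. The function name `split_average` and the sample values are illustrative and not from the paper.

```python
def split_average(rho, d_hat, h_hat, d_check, h_check):
    """Four-term expression of (8.18)/(8.19) for numerical inputs.

    d_check, h_check carry powers of (1 - rho); d_hat, h_hat carry powers of rho.
    """
    q = 1.0 - rho
    return (q ** 2.5 * d_check
            + rho ** 2.5 * d_hat
            + 2.0 * q * rho ** 1.5 * h_hat
            + 2.0 * q ** 1.5 * rho * h_check)


d_hat, h_hat, d_check = 1.2, 0.9, 1.0   # hypothetical sample values
h_u, h_v = 0.8, 1.1                      # hypothetical sample values

rho = 0.25
d_u = split_average(rho, d_hat, h_hat, d_check, h_u)
d_v = split_average(rho, d_hat, h_hat, d_check, h_v)

# Only the last term differs, so the difference is 2 (1-rho)^{3/2} rho (h_u - h_v),
# whose absolute value is at most 2 * rho * |h_u - h_v|.
print(d_u - d_v, 2.0 * (1.0 - rho) ** 1.5 * rho * (h_u - h_v))
print(split_average(0.0, d_hat, h_hat, d_check, h_u))  # reduces to d_check
print(split_average(1.0, d_hat, h_hat, d_check, h_u))  # reduces to d_hat
```

The bound $|D_U(\rho) - D_V(\rho)| \le 2\rho\,|H_U - H_V|$ visible here is what controls the integrand on the range (8.21).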

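Consider $I(\varepsilon,a)$ first. Note that if $D_U(\rho)\vee D_V(\rho) \le 2\varepsilon$, then

\[
\Bigl\{\Bigl(2 - \frac{D_U(\rho)}{\varepsilon}\Bigr)_+ - \Bigl(2 - \frac{D_V(\rho)}{\varepsilon}\Bigr)_+\Bigr\}^2 \le 2^2\rho^2\,\frac{\{\check H_U - \check H_V\}^2}{\varepsilon^2}. \tag{8.24}
\]

Moreover,

\[
\begin{aligned}
&\{0 \le \rho \le a : D_U(\rho)\vee D_V(\rho) \le 2\varepsilon\}\\
&\quad\subseteq \{0 \le \rho \le a : (1-\rho)^{5/2}\check D + 2(1-\rho)^{3/2}\rho\,(\check H_U\vee\check H_V) \le 2\varepsilon\}\\
&\quad\subseteq \{0 \le \rho \le a : \bar a^{5/2}\check D + 2\bar a^{3/2}\rho\,(\check H_U\vee\check H_V) \le 2\varepsilon\}\\
&\quad\subseteq \Bigl\{\rho : 0 \le \rho \le \frac{(2\varepsilon - \bar a^{5/2}\check D)_+}{2\bar a^{3/2}(\check H_U\vee\check H_V)}\wedge a\Bigr\}.
\end{aligned}
\tag{8.25}
\]

Thus $I(\varepsilon,a)$ is bounded above by the expectation of the random variable that
arises from integrating $2^2\rho^2\{\check H_U - \check H_V\}^2/\varepsilon^2$ against the measure
$\frac{1}{2\sqrt{2\pi}}\frac{d\rho}{\sqrt{(1-\rho)\rho^3}}$
over the interval $[0, (2\varepsilon - \bar a^{5/2}\check D)_+/(2\bar a^{3/2}(\check H_U\vee\check H_V))]$. Note that

\[
\int_0^x \frac{d\rho}{\sqrt{\rho^3}}\,\rho^{\alpha} = \frac{1}{\alpha-\frac12}\,x^{\alpha-1/2}, \qquad \alpha > \tfrac12. \tag{8.26}
\]

Hence, letting $C$ denote a generic constant with a value that does not depend
on $\varepsilon$ or $a$ and may change from line to line,

\[
\begin{aligned}
I(\varepsilon,a) &\le C\,\mathbb{P}\Bigl[\Bigl(\frac{(2\varepsilon - \bar a^{5/2}\check D)_+}{\check H_U\vee\check H_V}\Bigr)^{3/2}\frac{\{\check H_U - \check H_V\}^2}{\varepsilon^2}\Bigr]\\
&\le \frac{C}{\varepsilon^2}\,\mathbb{P}\bigl[(2\varepsilon - \bar a^{5/2}\check D)_+^{3/2}\,(\check H_U\vee\check H_V)^{1/2}\bigr]\\
&\le \frac{C}{\varepsilon^{1/2}}\,\mathbb{P}\bigl[(\check H_U + \check H_V)^{1/2}\,\mathbf{1}\{\check D \le 2\bar a^{-5/2}\varepsilon\}\bigr]\\
&\le \frac{C}{\varepsilon^{1/2}}\,\mathbb{P}\bigl[\check D^{1/2}\,\mathbf{1}\{\check D \le 2\bar a^{-5/2}\varepsilon\}\bigr]\\
&\le C\,\mathbb{P}\{\check D \le 2\bar a^{-5/2}\varepsilon\},
\end{aligned}
\tag{8.27}
\]

where in the second last line we used the fact that

\[
\mathbb{P}[\check H_U \mid \check E] = \mathbb{P}[\check H_V \mid \check E] = \check D, \tag{8.28}
\]

and Jensen's inequality for conditional expectations to obtain the inequalities
$\mathbb{P}[\check H_U^{1/2} \mid \check E] \le \check D^{1/2}$ and $\mathbb{P}[\check H_V^{1/2} \mid \check E] \le \check D^{1/2}$. Thus, $\lim_{\varepsilon\downarrow 0} I(\varepsilon,a) = 0$ for any
value of $a$.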
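The power-integral identity (8.26) drives the estimate of $I(\varepsilon,a)$: with $\alpha = 2$ it produces the exponent $3/2$. The following minimal sketch compares a midpoint-rule approximation with the closed form, assuming the reading $\int_0^x \rho^{-3/2}\rho^{\alpha}\,d\rho = x^{\alpha-1/2}/(\alpha-\tfrac12)$ for $\alpha > \tfrac12$. All function names are illustrative.

```python
def midpoint_integral(f, a, b, steps=100_000):
    """Composite midpoint rule; f is never evaluated at the endpoints."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


def power_integral_numeric(x, alpha):
    """Numerical value of the integral of rho^(alpha - 3/2) over [0, x]."""
    return midpoint_integral(lambda rho: rho ** (alpha - 1.5), 0.0, x)


def power_integral_closed(x, alpha):
    """Closed form x^(alpha - 1/2) / (alpha - 1/2), valid for alpha > 1/2."""
    return x ** (alpha - 0.5) / (alpha - 0.5)


# alpha = 2 and alpha = 3 are the exponents that occur in the estimates.
for alpha in (2.0, 3.0):
    for x in (0.1, 0.3):
        print(alpha, x,
              power_integral_numeric(x, alpha),
              power_integral_closed(x, alpha))
```

Only exponents well above $\tfrac12$ are used here, since the midpoint rule converges slowly when the integrand is singular at 0.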

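Turning to $II(\varepsilon,a)$, first note that $\hat D \le 4\hat H$ and, by the triangle inequality,

\[
\check D \le 2(\check H_U\wedge\check H_V). \tag{8.29}
\]

Hence, for some constant $K$ that does not depend on $\varepsilon$ or $a$,

\[
|D_U(\rho)\wedge D_V(\rho) - \check D| \le K\bigl(\hat H\rho^{3/2} + (\check H_U\wedge\check H_V)\rho\bigr) \tag{8.30}
\]

and

\[
|D_U(\rho)\vee D_V(\rho) - \check D| \le K\bigl(\hat H\rho^{3/2} + (\check H_U\vee\check H_V)\rho\bigr). \tag{8.31}
\]

Combining (8.31) with an argument similar to that which established
(8.25) gives, for a suitable constant $K^*$,

\[
\begin{aligned}
&\{0 \le \rho \le a : D_U(\rho)\wedge D_V(\rho) \le 2\varepsilon < D_U(\rho)\vee D_V(\rho)\}\\
&\quad= \{0 \le \rho \le a : 2\varepsilon < D_U(\rho)\vee D_V(\rho)\} \cap \{0 \le \rho \le a : D_U(\rho)\wedge D_V(\rho) \le 2\varepsilon\}\\
&\quad\subseteq \Bigl\{\rho : \frac{(2\varepsilon - \check D)_+}{K^*(\hat H + \check H_U\vee\check H_V)} \le \rho \le \frac{(2\varepsilon - \bar a^{5/2}\check D)_+}{2\bar a^{3/2}(\check H_U\wedge\check H_V)}\wedge a\Bigr\}.
\end{aligned}
\tag{8.32}
\]

Moreover, by (8.30) and the observation $|(2\varepsilon - x)_+ - (2\varepsilon - y)_+| \le |x - y|$,
we have for $D_U(\rho)\wedge D_V(\rho) \le 2\varepsilon < D_U(\rho)\vee D_V(\rho)$ that

\[
\begin{aligned}
\Bigl\{\Bigl(2 - \frac{D_U(\rho)}{\varepsilon}\Bigr)_+ - \Bigl(2 - \frac{D_V(\rho)}{\varepsilon}\Bigr)_+\Bigr\}^2
&= \Bigl(2 - \frac{D_U(\rho)\wedge D_V(\rho)}{\varepsilon}\Bigr)_+^2\\
&\le \frac{2}{\varepsilon^2}\{(2\varepsilon - \check D)_+\}^2 + \frac{2}{\varepsilon^2}\bigl\{(2\varepsilon - D_U(\rho)\wedge D_V(\rho))_+ - (2\varepsilon - \check D)_+\bigr\}^2\\
&\le \frac{2}{\varepsilon^2}\{(2\varepsilon - \check D)_+\}^2 + \frac{2}{\varepsilon^2}\{D_U(\rho)\wedge D_V(\rho) - \check D\}^2\\
&\le \frac{C}{\varepsilon^2}\bigl[(2\varepsilon - \check D)_+^2 + \hat H^2\rho^3 + (\check H_U\wedge\check H_V)^2\rho^2\bigr]
\end{aligned}
\tag{8.33}
\]

for a suitable constant $C$ that does not depend on $\varepsilon$ or $a$. It follows from
(8.26) and

\[
\int_x^y \frac{d\rho}{\sqrt{\rho^3}}\,\rho^{\beta} = \frac{1}{\frac12-\beta}\bigl[x^{\beta-1/2} - y^{\beta-1/2}\bigr], \qquad \beta < \tfrac12, \tag{8.34}
\]

that

\[
\begin{aligned}
II(\varepsilon,a) &\le \frac{C'}{\varepsilon^2}\,\mathbb{P}\Bigl[(2\varepsilon - \check D)_+^2\Bigl(\frac{(2\varepsilon - \check D)_+}{\hat H + \check H_U\vee\check H_V}\Bigr)^{-1/2}\Bigr]\\
&\quad+ \frac{C''}{\varepsilon^2}\,\mathbb{P}\Bigl[\hat H^2\Bigl(\frac{(2\varepsilon - \bar a^{5/2}\check D)_+}{2\bar a^{3/2}(\check H_U\wedge\check H_V)}\wedge a\Bigr)^{5/2}\Bigr]\\
&\quad+ \frac{C'''}{\varepsilon^2}\,\mathbb{P}\Bigl[(\check H_U\wedge\check H_V)^2\Bigl(\frac{(2\varepsilon - \bar a^{5/2}\check D)_+}{2\bar a^{3/2}(\check H_U\wedge\check H_V)}\wedge a\Bigr)^{3/2}\Bigr]
\end{aligned}
\tag{8.35}
\]

for suitable constants $C'$, $C''$ and $C'''$.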

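Consider the first term in (8.35). Using Jensen's inequality for conditional
expectations and (8.28) again, this term is bounded above by

\[
\frac{C'}{\varepsilon^2}\,\mathbb{P}\bigl[(2\varepsilon - \check D)_+^{3/2}\{A\check D^{1/2} + B\}\bigr] \le \frac{C'}{\varepsilon^2}\,\mathbb{P}\bigl[(2\varepsilon - \check D)_+^{3/2}\{2^{1/2}A\varepsilon^{1/2} + B\}\bigr] \tag{8.36}
\]

for suitable constants $A$, $B$. Now, by Jensen's inequality for conditional
expectation yet again, along with the invariance of standard Brownian
excursion under random rerooting (see Section 2.7 of [3]) and the fact that

\[
\mathbb{P}\{\check E_U \in dr\} = r e^{-r^2/2}\,dr \tag{8.37}
\]

(see Section 3.3 of [3]), we have

\[
\begin{aligned}
\mathbb{P}\bigl[(2\varepsilon - \check D)_+^{3/2}\bigr]
&= \mathbb{P}\Bigl[\Bigl(2\varepsilon - 2\,\mathbb{P}\Bigl[\check E_U + \check E_V - 2\inf_{U\wedge V\le t\le U\vee V}\check E_t \Bigm| \check E\Bigr]\Bigr)_+^{3/2}\Bigr]\\
&\le \mathbb{P}\Bigl[\Bigl(2\varepsilon - 2\Bigl\{\check E_U + \check E_V - 2\inf_{U\wedge V\le t\le U\vee V}\check E_t\Bigr\}\Bigr)_+^{3/2}\Bigr]\\
&= \mathbb{P}\bigl[(2\varepsilon - 2\check E_U)_+^{3/2}\bigr]
= \int_0^{\varepsilon} dr\,r e^{-r^2/2}(2\varepsilon - 2r)^{3/2}\\
&\le \int_0^{\varepsilon} dr\,r(2\varepsilon - 2r)^{3/2} = 2^{3/2}\varepsilon^{7/2}\int_0^1 ds\,s(1-s)^{3/2}.
\end{aligned}
\tag{8.38}
\]

Thus the limit as $\varepsilon \downarrow 0$ of the first term in (8.35) is 0 for each $a$.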
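The final equality in (8.38) is the substitution $r = \varepsilon s$, and the remaining integral is a Beta integral equal to $4/35$. The sketch below checks both facts numerically; the function names are illustrative, and the midpoint rule is only one convenient choice of quadrature.

```python
import math


def midpoint_integral(f, a, b, steps=100_000):
    """Composite midpoint rule; f is never evaluated at the endpoints."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


def unscaled_integral(eps):
    """Numerical value of the integral of r * (2*eps - 2*r)^(3/2) over [0, eps]."""
    return midpoint_integral(lambda r: r * (2.0 * eps - 2.0 * r) ** 1.5, 0.0, eps)


# int_0^1 s (1 - s)^{3/2} ds = B(2, 5/2) = Gamma(2) Gamma(5/2) / Gamma(9/2) = 4/35.
BETA_2_5HALF = math.gamma(2.0) * math.gamma(2.5) / math.gamma(4.5)


def scaled_closed_form(eps):
    """Right-hand side 2^{3/2} * eps^{7/2} * B(2, 5/2)."""
    return 2.0 ** 1.5 * eps ** 3.5 * BETA_2_5HALF


for eps in (0.2, 0.05, 0.01):
    lhs, rhs = unscaled_integral(eps), scaled_closed_form(eps)
    # Dividing by eps^2 leaves a quantity of order eps^{3/2}, which tends to 0.
    print(eps, lhs, rhs, rhs / eps ** 2)
```

The printed last column decays like $\varepsilon^{3/2}$, which is why the prefactor $\varepsilon^{-2}$ in (8.36) does not prevent the first term of (8.35) from vanishing.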

For the second term in (8.35), first observe by Jensen's inequality for 
conditional expectation and (8.37) that 



*{£> <r}< 



5.39) 



D 



< 



Eu 



(2r) 2 

< 2¥{E V < 2r} < 2^- = 4r 2 . 



Combining this observation with (8.29) and integrating by parts gives 
"f (2e - d 5 / 2 D), W 2 l 



5.40) 



1 2a 3 / 2 (H v A H 



A a 



V 



< 



(2e-a 5 / 2 D) + | 5 / 2 ] 



i 3 / 2 D 



A a 



i 



2e/a 5 / 2 



F{D £dr}[d 



-3/2 ( 



2e 



-5/2 



< 



2e/a 5 / 2 

dr4r 2--15/4 
2e/(aa 1 /2+a5/2) 2 



V r 
5 (2e 



A a 



5/2 



=5/2 



3 / 2 2£ 



40e 2 a- 15 / 4 



1/a 5 / 2 

l/(aa 1 /2 + a5/2) 
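If we denote the rightmost term by $L(\varepsilon,a)$, then it is clear that

\[
\lim_{a\downarrow 0}\lim_{\varepsilon\downarrow 0}\frac{1}{\varepsilon^2}L(\varepsilon,a) = 0. \tag{8.41}
\]

From (8.28) and Jensen's inequality for conditional expectations, the third
term in (8.35) is bounded above by

\[
\frac{C'''}{\varepsilon^2}\,\mathbb{P}\bigl[(\check H_U\wedge\check H_V)^{1/2}(2\varepsilon - \bar a^{5/2}\check D)_+^{3/2}\bigr] \le \frac{C'''}{\varepsilon^2}\,\mathbb{P}\bigl[\check D^{1/2}(2\varepsilon - \bar a^{5/2}\check D)_+^{3/2}\bigr], \tag{8.42}
\]

and the calculation in (8.38) shows that the rightmost term converges to
zero as $\varepsilon \downarrow 0$ for each $a$.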

Putting together the observations we have made on the three terms in
(8.35), we see that

\[
\lim_{a\downarrow 0}\lim_{\varepsilon\downarrow 0} II(\varepsilon,a) = 0. \tag{8.43}
\]

It follows from the dominated convergence theorem that

\[
\lim_{\varepsilon\downarrow 0} III(\varepsilon,a) = 0 \tag{8.44}
\]

for all $a$, and this completes the proof. □